[57] ABSTRACT

A method and apparatus for synchronizing audio and video 
data streams in a computer system during a multimedia 
presentation to produce a correctly synchronized presenta- 
tion. The preferred embodiment of the invention utilizes a 
nonlinear feedback method for data synchronization. The 
method of the present invention periodically queries each 
driver for the current audio and video position (or frame 
number) and calculates the synchronization error. The
synchronization error is used to determine a tempo value
adjustment to one of the data streams designed to place the
video and audio back in sync. The method then adjusts the
audio or video tempo to maintain the audio and video data
streams in synchrony. In the preferred embodiment of the
invention, the video tempo is changed nonlinearly over time 
to achieve a match between the video position and the 
equivalent audio position. The method applies a smoothing 
function to the determined tempo value to prevent overcom- 
pensation. The method of the present invention can operate 
in any hardware system and in any software environment 
and can be adapted to existing systems with only minor 
modifications. 

40 Claims, 7 Drawing Sheets 
METHOD AND APPARATUS FOR
SYNCHRONIZING AUDIO AND VIDEO DATA 
STREAMS IN A MULTIMEDIA SYSTEM 

FIELD OF THE INVENTION 

The present invention relates generally to multimedia 
computer systems, and more particularly to a method and 
apparatus for synchronizing video and audio data streams in 
a computer system during a multimedia presentation. 

DESCRIPTION OF THE RELATED ART 

Multimedia computer systems have become increasingly 
popular over the last several years due to their versatility and 
their interactive presentation style. A multimedia computer 
system can be defined as a computer system having a 
combination of video and audio outputs for presentation of 
audio-visual displays. A modern multimedia computer system
typically includes one or more storage devices such as
an optical drive, a CD-ROM, a hard drive, a videodisc, or an
audiodisc, and audio and video data are typically stored on
one or more of these mass storage devices. In some file 
formats the audio and video are interleaved together in a 
single file, while in other formats the audio and video data 
are stored in different files, many times on different storage 
media. Audio and video data for a multimedia display may 
also be stored in separate computer systems that are networked
together. In this instance, the computer system
presenting the multimedia display would receive a portion of
the necessary data from the other computer system via the
network cabling. 

A multimedia computer system also includes a video card 
such as a VGA (Video Graphics Array) card which provides 
output to a video monitor, and a sound card which provides 
audio output to speakers. A multimedia computer system 
may also include a video accelerator card or other special- 
ized video processing card for performing video functions, 
such as compression, decompression, etc. When a computer 
system displays a multimedia presentation, the computer 
system microprocessor reads the audio and video data stored
on the respective mass storage devices, or received from the 
other computer system in a distributed system, and provides 
the audio stream through the sound card to the speakers and 
provides the video stream through the VGA card and any 
specialized video processing hardware to the computer 
video monitor. Therefore, when a computer system presents 
an audio-visual display, the audio data stream is decoupled
from the video data stream, and the audio and video data 
streams are processed by separate hardware subsystems. 

A multimedia computer system also includes an operating 
system and drivers for controlling the various hardware 
elements used to create the multimedia display. For 
example, a multimedia computer includes an audio driver or 
sound card driver for controlling the sound card and a video 
driver for controlling the optional video processing card. 
One example of an operating system which supports mul- 
timedia presentations is the Multimedia Extensions for the 
Microsoft Windows operating system. 

Graphic images used in Windows multimedia applica- 
tions can be created in either of two ways, these being 
bit-mapped images and vector-based images. Bit-mapped 
images comprise a plurality of picture elements (pixels) and 
are created by assigning a color to each pixel inside the 
image boundary. Most bit-mapped color images require one 
byte per pixel for storage, so large bit-mapped images create 
correspondingly large files. For example, a full-screen,
256-color image in 640-by-480-pixel VGA mode requires
307,200 bytes of storage, if the data is not compressed.
Vector-based images are created by defining the end points,
thickness, color, pattern and curvature of lines and solid
objects comprised within the image. Thus, a vector-based
image includes a definition which consists of a numerical
representation of the coordinates of the object, referenced to
a corner of the image.

Bit-mapped images are the most prevalent type of image
storage format, and the most common bit-mapped-image file
formats are as follows. A file format referred to as BMP is
used for Windows bit-map files in 1-, 2-, 4-, 8-, and 24-bit
color depths. BMP files contain a bit-map header that defines
the size of the image, the number of color planes, the type
of compression used (if any), and the palette used. The
Windows DIB (device-independent bit-map) format is a
variant of the BMP format that includes a color table
defining the RGB (red green blue) values of the colors used.
Other types of bit-map formats include the TIFF (tagged
image file format), the PCX (Zsoft Personal Computer
Paintbrush Bitmap) file format, the GIF (graphics interchange
file) format, and the TGA (Texas Instruments
Graphic Architecture) file format.

The standard Windows format for bit-mapped images is a
256-color device-independent bit map (DIB) with a BMP
(the Windows bit-mapped file format) or sometimes a DIB
extension. The standard Windows format for vector-based
images is referred to as WMF (Windows meta file).

Compression 


Full-motion video implies that video images shown on the
computer's screen simulate those of a television set with
identical (30 frames-per-second) frame rates, and that these
images are accompanied by high-quality stereo sound. A
large amount of storage is required for high-resolution color
images, not to mention a full-motion video sequence. For
example, a single frame of NTSC video at 640-by-400-pixel
resolution with 16-bit color requires 512K of data per frame.
At 30 frames per second, over 15 Megabytes of data storage
are required for each second of full motion video. Due to the
large amount of storage required for full motion video,
various types of video compression algorithms are used to
reduce the amount of necessary storage. Video compression
can be performed either in real-time, i.e., on the fly during
video capture, or on the stored video file after the video data
has been captured and stored on the media. In addition,
different video compression methods exist for still graphic
images and for full-motion video.
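
The storage figures quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the function names are hypothetical and the calculation assumes uncompressed frames with no header or padding overhead.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed for one uncompressed frame (no headers or padding)."""
    return width * height * bits_per_pixel // 8


def bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video data rate in bytes per second."""
    return frame_bytes(width, height, bits_per_pixel) * frames_per_second


# One 640-by-400 frame at 16-bit color, then one second at 30 frames per second.
one_frame = frame_bytes(640, 400, 16)
one_second = bytes_per_second(640, 400, 16, 30)
print(one_frame, one_second)
```

A 640-by-400 frame at 16 bits per pixel comes to 512,000 bytes, and 30 such frames come to 15,360,000 bytes, which agrees with the "512K" and "over 15 Megabytes" figures in the text.
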
Examples of video data compression for still graphic
images are RLE (run-length encoding) and JPEG (Joint
Photographic Experts Group) compression. RLE is the standard
compression method for Windows BMP and DIB files.
The RLE compression method operates by testing for duplicated
pixels in a single line of the bit map and storing the
number of consecutive duplicate pixels rather than the data
for each pixel itself. JPEG compression is a group of related
standards that provide either lossless (no image quality
degradation) or lossy (imperceptible to severe degradation)
compression types. Although JPEG compression was
designed for the compression of still images rather than
video, several manufacturers supply JPEG compression
adapter cards for motion video applications.
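
The run-length idea described above can be sketched as follows. This is a simplified illustration only: it stores (count, value) pairs for one line of pixels and does not reproduce the actual byte layout, run-length limits, or escape codes of the Windows BMP RLE formats. The function names are hypothetical.

```python
def rle_encode_row(pixels):
    """Collapse one line of pixel values into (run_length, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, value])   # start a new run
    return [(count, value) for count, value in runs]


def rle_decode_row(runs):
    """Expand (run_length, value) pairs back into a line of pixels."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


row = [7, 7, 7, 2, 2, 9]
encoded = rle_encode_row(row)
assert rle_decode_row(encoded) == row
```

Lines with long runs of identical pixels shrink considerably, while lines with no repeated neighbors gain nothing, which is why the benefit of RLE depends on image content.
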

In contrast to compression algorithms for still images,
most video compression algorithms are designed to compress
full motion video. Video compression algorithms for
motion video generally use a concept referred to as interframe
compression, which involves storing only the differences
between successive frames in the data file. Interframe
compression begins by digitizing the entire image of a key
frame. Successive frames are compared with the key frame,
and only the differences between the digitized data from the
key frame and from the successive frames are stored.
Periodically, such as when new scenes are displayed, new
key frames are digitized and stored, and subsequent comparisons
begin from this new reference point. It is noted that
interframe compression ratios are content-dependent, i.e., if
the video clip being compressed includes many abrupt scene
transitions from one image to another, the compression is
less efficient. Examples of video compression which use an
interframe compression technique are MPEG, DVI and
Indeo, among others.

MPEG (Moving Pictures Experts Group) compression is
a set of methods for compression and decompression of full
motion video images that uses the interframe compression
technique described above. The MPEG standard requires
that sound be recorded simultaneously with the video data,
and the video and audio data are interleaved in a single file
to attempt to maintain the video and audio synchronized
during playback. The audio data is typically compressed as
well, and the MPEG standard specifies an audio compression
method such as ADPCM (Adaptive Differential
Pulse Code Modulation) for audio data.

A standard referred to as Digital Video Interactive (DVI)
format developed by Intel Corporation is a compression and
storage format for full-motion video and high-fidelity audio
data. The DVI standard uses interframe compression techniques
similar to that of the MPEG standard and uses
ADPCM compression for audio data. The compression
method used in DVI is referred to as RTV 2.0 (real time
video), and this compression method is incorporated into
Intel's AVK (audio/video kernel) software for its DVI product
line. IBM has adopted DVI as the standard for displaying
video for its Ultimedia product line. The DVI file format is
based on the Intel i750 chipset and is supported through the
Media Control Interface (MCI) for Windows. Microsoft and
Intel jointly announced the creation of the DV MCI (digital
video media control interface) command set for Windows
3.1 in 1992.

The Microsoft Audio Video Interleaved (AVI) format is a
special compressed file structure format designed to enable
video images and synchronized sound stored on CD-ROMs
to be played on PCs with standard VGA displays and audio
adapter cards. The AVI compression method uses an interframe
method, i.e., the differences between successive
frames are stored in a manner similar to the compression
methods used in DVI and MPEG. The AVI format uses
symmetrical software compression-decompression
techniques, i.e., both compression and decompression are
performed in real time. Thus AVI files can be created by
recording video images and sound in AVI format from a
VCR or television broadcast in real time, if enough free hard
disk space is available.

In the AVI format, data is organized so that coded frame
numbers are located in the middle of an encoded data file
containing the compressed audio and compressed video. The
digitized audio and video data are organized into a series of
frames, each having header information. Each frame of the
audio and video data streams is tagged with a frame number
that typically depends upon the frame rate. For example, at
every 33 milliseconds (ms) or a 30th of a second, a frame
number is embedded in the header of the video frame and at
every 30th of a second, or 33 ms, the same frame number is
embedded in the header of the audio track. The number
assigned to the frames is, therefore, coordinated so that the
corresponding audio and video frames are originally tagged
with the same number. Therefore, since the frames are
initially received simultaneously, the frames can actually be
preprocessed so that tag codes are placed into the header
files of the audio and the video for tracking the frame
number and position of the audio and video tracks.

In the AVI format, the audio and video information are
interleaved (alternated in blocks) in the CD-ROM to minimize
delays that would result from using separate tracks for
video and audio information. Also, the audio and video data
are interleaved to synchronize the data as it is stored on the
system. This is done in an attempt to synchronize the audio
and video data during playback.

The Apple QuickTime format was developed by Apple for
displaying animation and video on Macintosh computers,
and has become a de facto multimedia standard. Apple's
QuickTime and Microsoft's AVI take a parallel approach to
the presentation of video stored on CD-ROMs, and the
performance of the two systems is similar. The QuickTime
format, like AVI, uses software compression and decompression
techniques but also can employ hardware devices,
similar to those employed by DVI, to speed processing. The
Apple QuickTime format became available for the PC under
Microsoft Windows in late 1992.

As mentioned above, the audio and video data streams in
a multimedia presentation are processed by separate hardware
subsystems under the control of separate device drivers.
The audio and video data are separated into separate data
streams that are then transmitted to separate audio and video
subsystems. The video data is transmitted to the video
subsystem for display, and the audio data is transmitted to
the sound subsystem for broadcast. These two subsystems
are addressed by separate drivers, and each driver is loaded
dynamically by the operating system during a multimedia
presentation. In an operating system that is multi-tasking,
has multiple drivers, or has multiple windows, the time
period between the servicing of drivers is indeterminate. If
a driver is not serviced by the operating system in time for
the next frame, a portion of the multimedia systems may
stall, resulting in the audio not being synchronized with the
video. When the audio and video portions of a multimedia
presentation become unsynchronized, many times this lack
of synchronization is noticeable to the viewer, resulting in a
less pleasing display. One result of audio and video data
being out of sync is that the viewer may hear words that do
not match the lips of the speaker, a situation commonly
called "out of lip sync."

Therefore, many times the corresponding audio and video
frames of a multimedia presentation are not played synchronously
together. The reasons for the audio and video data
streams falling out of sync during a presentation include the
inherent decoupling of the audio and video data streams in
separate subsystems in conjunction with system bottlenecks
and performance issues associated with the large amounts of
data that are required to be manipulated during a multimedia
presentation. As mentioned above, full motion video clips
with corresponding audio require massive amounts of system
resources to process. However, a considerably greater
amount of processing is required to display the video data
than is required for the audio data. First, the video data must
be decompressed either in software or in a codec
(compression-decompression) device. If the color depth of
the video is higher than that of the display, such as when an
AVI file with 16 bit video is played on an 8 bit display, the
computer must dither colors to fit within the display's color
restrictions. Also, if the selected playback window size is
inconsistent with the resolution at which the video was
captured, the computer is required to scale each frame.
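
As a rough illustration of the interframe compression concept described earlier in this section (a key frame stored in full, followed by only the differences from that key frame), the following sketch encodes frames as flat lists of pixel values. It is a toy model under stated assumptions, not the method of MPEG, DVI, Indeo, or AVI; real codecs are far more elaborate. All names are hypothetical, and the first frame is assumed to be the only key frame.

```python
def encode_interframe(frames):
    """Store the first frame whole, then only pixels that differ from it."""
    key = frames[0]
    encoded = [("key", list(key))]
    for frame in frames[1:]:
        diffs = [(i, new) for i, (old, new) in enumerate(zip(key, frame))
                 if old != new]
        encoded.append(("delta", diffs))
    return encoded


def decode_interframe(encoded):
    """Rebuild every frame from the key frame plus its stored differences."""
    frames = []
    key = None
    for kind, payload in encoded:
        if kind == "key":
            key = list(payload)
            frames.append(list(key))
        else:
            frame = list(key)
            for index, value in payload:
                frame[index] = value
            frames.append(frame)
    return frames


clip = [[1, 1, 1, 1], [1, 2, 1, 1], [1, 2, 3, 1]]
packed = encode_interframe(clip)
assert decode_interframe(packed) == clip
```

The size of each stored delta grows with the number of changed pixels, which mirrors the point made in the text that frames with a moving background take more data, and more processing time, than frames with only minor changes.
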




In addition to the greater amount of processing required
for video data, the amount of video processing can vary
considerably, thus further adversely affecting synchronization.
For example, one variable that affects the speed of
video playback is the decompression performed on the video
data. The performance of software decompression algorithms
can vary for a number of reasons. For example, due
to the interframe method of compressing data, the number of
bytes that comprise each video frame is variable, depending
on how similar the prior video frame is to the current video
frame. Thus, more time is required to process a series of
frames in which background is moving than is required to
process a series of frames containing only minor changes in
the foreground. Other variables include whether the color
depth of the video equals that of the display and whether the
selected playback window size is consistent with the resolution
at which the video was captured, as mentioned above.

In addition, a slow CPU adversely affects every stage in
the processing of a video file for playback. A sluggish hard
disk or CD-ROM controller can also adversely affect
performance, as can the performance of the display controller or
video card. Also, other demands can be made on the system
as a result of something as simple as a mouse movement.
While the above processing is being performed on the video
and audio data, and while other demands are made on
system resources, it becomes very difficult to ensure that the
audio and video data remain in synchronization.

Video for Windows includes a method which presumably
attempts to maintain the audio and video portions of a
multimedia display in sync, i.e., attempts to adapt when the
computer system cannot keep up with either the video or
audio portions of the display. Video for Windows benchmarks
the video hardware when it first begins execution as
well as every time thereafter that the default display is
changed. The results of these tests are used to determine a
particular system's baseline display performance at various
resolutions and color depths. Video for Windows then uses
this information regarding the capabilities of the video
system to adjust the video frame rate to match the benchmarked
performance for the default display. Video for
Windows maintains the continuity of the audio at all costs
because a halting audio track is deemed more distracting.
When the burden of the video playback is such that the
system cannot keep up, Video for Windows skips frames
during playback or adjusts the frame rate continuously as the
system's resource usage patterns change.

However, the method used by Video for Windows in
adjusting the video rate to match the benchmarked performance
of the default display results in an average frame rate
suitable for the benchmark determined at the time the default
was last changed. Attempts to display video frames containing
an unusually heavy amount of non-repetitive data will
slow processing down to the point where the benchmarked
frame rate is no longer useful. When this happens, video
frames are skipped because the burden of processing the
video data becomes too great to preserve lip-sync in the
display. The result can be "jerky" movement of the images
of persons speaking as noted in Discover Windows 3.1
Multimedia, by Roger Jennings (Que Corp. 1992), pp.
105-106. Thus, the method used by Video for Windows has
proven to be inadequate, i.e., the video and audio portions
still fall out of sync or exhibit "jerky" movement during a
presentation.

Shortcomings inherent in decoupled audio multimedia
systems have been a problem for some time, and various
efforts have been made to synchronize the audio and video
portions of a presentation. There has been a recognized need
in the industry for a solution to this problem. However, no
satisfactory solution has been found, prior to the present
invention.

Therefore, a method and apparatus is desired which 
provides improved synchronization between digital audio 
and digital video data streams in a multimedia computer 
system, i.e., a method is needed to assure that corresponding 
video and audio frames are played back together. A syn- 
chronization method is also desired that does not require the 
use of an encoding procedure prior to the processing of 
audio and video digital signals. It is also desirable to provide 
a multimedia synchronization system that is capable of 
functioning consistently whether video and audio data are 
delivered to the system in separate files or interleaved in one 
file. 

SUMMARY OF THE INVENTION 

The present invention comprises a method and apparatus
for synchronizing separate audio and video data streams in
a multimedia system. The preferred embodiment of the
invention utilizes a nonlinear feedback method for data
synchronization that is independent of hardware, the operating
system and the video and audio drivers used. The
system and method of the present invention does not require
that incoming data be time stamped, or that any timing
information exist in the video data stream relative to audio
and video data correspondence. Further, the data is not
required to be modified in any way prior to the transfer of
data to the video and audio drivers, and no synchronization
information need be present in the separated audio and video
data streams that are being synchronized by the system and
method of the present invention. The preferred embodiment
of the present invention requires that there be a common
starting point for the audio and video data, i.e., that there be
a time index of zero where the audio and video are both in
synchrony, such that the first byte of audio and video digital
data are generated simultaneously.

The synchronization method of the present invention is
called periodically during a multimedia display to synchronize
the video and audio data streams. In the preferred
embodiment, a periodic timer is set to interrupt the multimedia
operating system at uniform intervals during a multimedia
display and direct the operating system to invoke the
synchronization method of the present invention. When the
synchronization method is invoked, the method first queries
the video driver to determine the current video frame
position and then queries the audio driver to determine the
current audio position. The current audio position is then
used to compute the equivalent audio frame number. The
synchronization method compares the video and audio
frame positions and computes a synchronization error value,
which is essentially the number of frames by which the
video frame position is in front of or behind the current
audio frame position.
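
The per-call computation described in this part of the summary can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the audio position is assumed to be reported in milliseconds, the frame rate is assumed to be 30 frames per second, the smoothing is reduced to a two-term weighted average with an arbitrary weight, and all function names and default values are hypothetical. The nonlinear mapping from error to tempo is not shown.

```python
def audio_frame_number(audio_position_ms, frames_per_second=30.0):
    """Convert an audio position (assumed milliseconds) to a frame number."""
    return audio_position_ms * frames_per_second / 1000.0


def synchronization_error(video_frame, audio_position_ms, frames_per_second=30.0):
    """Frames by which video is ahead of (positive) or behind (negative) audio."""
    return video_frame - audio_frame_number(audio_position_ms, frames_per_second)


def smooth_tempo(raw_tempo, previous_tempo, weight=0.5):
    """Weighted average of the new tempo and the prior tempo (weight assumed)."""
    return weight * raw_tempo + (1.0 - weight) * previous_tempo


def should_send_tempo(error_frames, tempo, last_sent_tempo, tolerance_frames=1.0):
    """Send a tempo only if out of tolerance and it differs from the last one sent."""
    return abs(error_frames) > tolerance_frames and tempo != last_sent_tempo


# Video driver reports frame 95 while audio has reached 3000 ms (frame 90).
error = synchronization_error(95, 3000)
print(error)
```

In this example the video is five frames ahead of the audio, so a caller following the scheme in the text would lower the video tempo, smooth the result, and pass it to the video driver only if the tolerance test succeeds.
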

The synchronization error is used to assign a tempo value
meaningful to either the video driver or the audio driver. In
the preferred embodiment, the method adjusts the video
tempo to maintain video synchronization, but in an alternate
embodiment the method adjusts the audio tempo to maintain
synchronization. Once a video tempo value has been
determined, the preferred method adjusts this video tempo
value by applying a smoothing function, i.e., a weighted
average of prior tempo values, to the determined tempo
value. If the synchronization error is determined to be
greater than a defined tolerance, i.e., if the audio and video
data streams are more than a certain number of frames out of
sync, and if the tempo value is not equal to the last tempo
value previously sent to the video driver, then the method
adjusts the video frame speed by passing the tempo value to
the video driver.

If the synchronization error is approximately 0, i.e., the
audio and video data streams are in sync, and the last tempo
value sent to the video driver was not the nominal rate, i.e.,
the rate intended to exactly match the [...], then the method
passes a tempo value at the nominal rate to the video driver.
In other words, if the audio and video data streams are in
sync, a tempo value at the nominal rate is passed to the video
driver. This removes any effects of the smoothing function,
which otherwise would yield a tempo value other than the
nominal rate.

The method also determines if the audio is too far ahead
of the video and if the audio is playing. If so, the audio is
paused to allow the video to catch up. If the method
determines that the audio is paused and that the video has
caught up, the method restarts the audio. The method saves
the video tempo value for comparison [...] the periodic
timer and surrenders control to the operating system until
called again.

Therefore, the present invention provides an improved
method of synchronizing the audio and video data streams
during a multimedia presentation to provide a correctly
synchronized presentation. The present invention permits
the use of existing software drivers and multimedia operating
systems. Further, the method of the present invention
operates independently of where the audio and video data
are stored as well as the type of operating system or drivers
being used. Thus the present invention operates regardless of
whether the audio and video data are interleaved in one file,
stored on different media, or stored in separate computer
systems. Also, the present invention does not require any
type of time stamping or tagging of data, and thus does not
require any modification of the video or audio data. Further,
the present invention operates regardless of the type of
compression/decompression algorithm used on the video [...]

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be
obtained when the following detailed description of the
preferred embodiment is considered in conjunction with the
following drawings, in which:

FIG. 1 illustrates a block diagram of a multimedia computer
system according to one embodiment of the invention;

FIG. 2 is a block diagram illustrating a typical multimedia
control system;

FIG. 3 is a block diagram illustrating a prior art multimedia
software architecture;

FIG. 4 is a block diagram illustrating a multimedia
software architecture incorporating the synchronization
method of the present invention; and

FIGS. 5A-B are flowchart diagrams illustrating operation
of the synchronization method of the present invention.

FIG. 6 illustrates audio and video data streams having a
common starting point.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

Multimedia Computer System

Referring now to FIG. 1, a block diagram illustrating a
multimedia computer system according to one embodiment
of the present invention is shown. It is noted that FIG. 1
illustrates only portions of a functioning computer system,
and those elements not necessary to the understanding of the
operation of the present invention have been omitted for
simplicity. As shown, the multimedia computer system [...]
bus 106 is coupled to an expansion bus 112 by means of a
bus controller 110. The expansion bus may be any of various
types including the AT (advanced technology) bus or industry
standard architecture (ISA) bus, the EISA (extended
industry standard architecture) bus, a microchannel (MCA)
bus, etc. A video card or video adapter such as a VGA (video
graphics array) card 120 is coupled to the expansion bus 112
and is adapted to interface to a video monitor 122, as shown.
The computer system may also include a video accelerator
card 124 for performing compression/decompression
(codec) functions. However, in the preferred embodiment
the computer system does not include a video accelerator
card. An audio card or sound card 130 is also coupled to the
expansion bus 112 and interfaces to a speaker 132. The audio
board 130 is preferentially a Sound Blaster II brand card
by Creative Labs, Inc. of Milpitas, Calif.

Various mass storage devices are also coupled to the
expansion bus 112, preferably including a CD-ROM 140,
and a hard drive 142, as well as others. One or more of these
mass storage devices store video and audio data which is
used during presentation of a multimedia display. The audio
and video data may be stored in any of a number of formats
and may be stored on different media. Further, the audio and
video data may be stored on media located in other computer
systems that are connected to the computer system via a
network. Thus the present invention can operate in a distributed
environment.

It is noted that a multimedia computer system according
to the present invention may be configured in any of a
number of ways. For example, the video and audio card 120
and 130, as well as one or more of the mass storage devices
140 or 142 may be coupled to a local bus such as the
PCI (peripheral component interconnect) bus, or the VESA
(Video Enhanced Standards Association) local bus, as
desired. Various other computer configurations are also
contemplated, such as a distributed system.

In the preferred embodiment of the invention, the multimedia
computer system illustrated in FIG. 1 operates using
the Windows 3.1 operating system from Microsoft Corporation
of Redmond, Washington. The computer system also
preferably includes the Microsoft Windows multimedia
extension software, including Microsoft's Media Control
Interface and associated drivers. The Windows Media Control
Interface (MCI) is a set of high-level commands that
provide a device-independent interface for controlling multimedia
devices and media files. The MCI command set is
designed to provide a generic core set of commands to
control different types of media devices. Because of the high
level of device independence provided by the MCI command
set, a programmer can use MCI commands rather than
low level [...] multimedia capabilities of [...]
operating systems and multimedia software, as
desired.

The computer system includes video and audio drivers
which interface to video and audio hardware, respectively.
The audio driver interfaces between the multimedia operating
system and the audio card 130, and the video driver
interfaces between the operating system and the video accelerator
card, if any. In the preferred embodiment, which does not
include a video accelerator 124, the video driver does not
actually interface to any video hardware, but rather performs
various video data processing on the main CPU 102. In the
preferred embodiment, the video driver is comprised in the
Intel Audio-Visual Kernel (AVK). The audio driver is preferably
an MCI compliant WAV driver corresponding to the
respective audio card 130.

The computer system also includes a synchronization
method according to the present invention which synchronizes
the audio and video data streams during a multimedia
presentation to ensure that the appropriate sounds are generated
by the speaker 132 when the corresponding images
are being displayed by the video monitor 122.

Multimedia Software Architecture of the Preferred
Embodiment

Referring now to FIG. 2, the Microsoft Windows Multimedia
Extensions Software Architecture is illustrated. As
shown in FIG. 2, a multimedia application 200 directs a
computer system to present a multimedia display by interfacing
to the hardware through the operating system and
various device driver layers. The block 210 includes the
Windows Kernel and Graphics Device Interface (GDI), i.e.,
the bulk of the Windows operating system. As shown, the
multimedia application 200 interfaces through the Windows
operating system 210 to Windows device drivers 220. The
device drivers 220 interface to the various elements in the
computer system including the printer, hard drive 142, video
monitor 122, etc.

The block 240 comprises the Windows Multimedia
Extensions Software. This translation layer isolates applications
from device drivers and centralizes device-independent
code. The translation layer 240 translates a
multimedia function call into a set of Media Control Interface
("MCI") calls which interface to Media Control Interface
drivers 250. As shown, the Media Control Interface
drivers interface to mass storage devices 260 and 270 such
as the CD-ROM 140 or hard drive 142. The Media Control
Interface layer also communicates with MCI compliant
multimedia device drivers 230 using a set of low level
functions, as shown. The multimedia hardware device drivers
230 directly control a multimedia device such as an audio
card 130 or video accelerator 124. For more information on
the Media Control Interface layer 240, please see Discover
Windows 3.1 Multimedia by Roger Jennings (Que Corp.
1992) chapters 23 and 24, which is hereby incorporated by
reference. Please also see generally the Microsoft Multimedia
Programmer's Reference and the Microsoft Windows
Multimedia Programmer's Workbook, available from
Microsoft Corporation, which are both hereby incorporated
by reference.

The recording and presentation of multimedia displays is
handled by the Windows Multimedia Extension Software in
conjunction with the individual MCI drivers. Currently, not
all multimedia drivers support MCI commands and command
options. In particular, optional commands used by
some devices are not supported. An example of an option
command is "set," which sets the operating state of a device.
Such a command would support the option "tempo 200" for
example, as a means of controlling the speed of the playback
device. Another example of an option command is the
"pause" command, which suspends operation of a playback
device, but leaves the device ready to resume playing
immediately.

One embodiment of the present invention avoids use of
MCI commands for querying the audio and video drivers
and controlling the tempo and pausing of playback devices.
In this embodiment the synchronization method of the
present invention makes calls directly to the API of the audio
and video drivers to obtain the necessary information and
control the tempo of one of the respective data streams to
maintain synchronization. The preferred embodiment of the
invention uses the MCI interface to access the respective
multimedia device drivers to query the drivers as well as
adjust the tempo of the data streams. In alternative embodiments
of the invention, the synchronization method uses
MCI commands to interface to the audio driver and uses the
direct API to access the video driver, or vice versa.

Multimedia storage devices can be classified as either
simple devices or compound devices. Simple devices do not
require a data file for playback, and videodisc players and
CD audio players are examples of simple devices. Compound
devices require a data file for playback and examples
of compound devices include digital video players and
waveform audio players. The preferred embodiment of the
invention is used to synchronize audio and video data
streams from compound devices.

Multimedia System

Referring now to FIG. 3, a prior art multimedia system
that does not include the synchronizing method of the
present invention is shown. As shown in FIG. 3, audio and
video data are stored in a multimedia system using one of a
variety of storage devices 301, 302, and 303, including the
hard disk 142 or CD-ROM 140. If the audio and video data
are stored in the audio-video interleave (AVI) file format,
then the audio and video data are interleaved together on the
storage media in the same file. Alternatively, the audio and
video data are stored on different storage media, perhaps on
different computer systems connected via a network. As
previously noted, the present invention operates regardless
of whether the audio and video data are interleaved together,
stored on separate media, or stored on separate computer
systems.

The audio and video data are provided to the CPU 102,
which is preferably executing the Microsoft Windows Operating
System 307, as well as the Microsoft Multimedia
Extension software. The Multimedia Extension software
invokes the media control interface (MCI) layer software
309, which in turn invokes respective MCI digital drivers
311 to provide the respective data streams to the respective
hardware subsystems. As shown, the MCI digital driver 311
communicates with the video driver in the Intel audio-video
kernel (AVK) 341. The video driver in the AVK performs
various processing on the video data and interfaces to video
accelerator hardware 124, if any. The video driver in the
AVK 341 also monitors the frame number of the video frame
being played. The video data is then provided to the video
frame buffer 345 in the video adaptor (VGA card) 120 where
it is then displayed on the video monitor 122. The MCI
digital driver 311 also communicates with the audio driver
331 to provide the audio data 313 to the respective hardware
audio card 130 and on to the speaker 132. The audio driver 331 is
preferably an MCI-compliant driver. Thus, audio data is read
from the respective storage device and provided to the audio
card 130 by the MCI driver 311 and the audio driver 331
executing on the CPU 102. The audio driver 331 monitors
the number of bytes of audio data processed from the start
of a particular set of data.

As discussed in the background section, prior art multimedia
systems provide generally unsynchronized audio and
video outputs during a multimedia presentation due to the
inherent difficulty of synchronizing separate audio and video
data streams passing through separate audio and video 
subsystems and controlled by separate audio and video 
drivers. Numerous factors can affect the playback of the 
audio and video data streams, including the greater amount 
and more variable amount of processing required for the 
video data as well as other demands that can be made on the 
system. 

As discussed above, a large amount of processing may be 
required before the video data can be displayed on the video 
monitor 122. For example, if the video's color depth is 
higher than the display's, as when an AVI file is played with 
16-bit video on an 8-bit display, colors must be dithered to 
fit within the display's color restrictions. Also, if the play- 
back window size differs from the resolution at which the 
video was captured, each frame must be scaled. Video and 
audio data processing can be adversely affected by a number 
of other factors, including a slow hard disk, a slow CD-ROM 
controller, and a slow display controller or audio card. 
During this process, it becomes virtually impossible for the 
environment to maintain the audio and video data properly 
synchronized while managing other critical tasks. As dis- 
cussed in the background section, prior art methods, to the 
extent there are any, have proved inadequate in maintaining
synchronization between video and audio data streams. 

Multimedia Software Architecture Including the 
Preferred Embodiment of the Present Invention 

Referring now to FIG. 4, operation of the preferred 
embodiment of the present invention is illustrated. Logical 
blocks in FIG. 4 that are similar to those shown in FIG. 3 are
designated with the same reference numeral for convenience.
As discussed above, the preferred embodiment uses
an MCI compatible interface to perform the synchronization 
method of the present invention. The preferred embodiment 
also uses the video driver in the Intel Audio Video Kernel
(AVK) 341. Because of the high level of device indepen- 
dence provided by the MCI interface, the multimedia capa- 
bilities of Microsoft Windows can be accessed through MCI 
protocols, rather than through low-level Application Pro- 
gram Interfaces (API). The preferred embodiment uses the 
MCI protocol and commands to interface with the video and 
audio drivers. These protocols are found in the Microsoft 
Windows Software Development Kit, Multimedia Program- 
mer's Guide, Document Number PC30253-0492 which can 
be obtained from Microsoft Corporation of Redmond, Washington
(facsimile number 206-936-7329). In an alternative
embodiment of the present invention, as mentioned above, 
the method of the present invention avoids the MCI layer 
and instead communicates directly with the API of the video 
and audio drivers. Bypassing the MCI interface layer results 
in a slight speed increase due to the decreased overhead. The 
source code listing following this description implements an 
embodiment of the invention that bypasses the MCI layer 
and instead communicates directly with the API of the audio 
and video drivers. 

As shown in FIG. 4, video data and commands pass from 
the MCI digital driver 311 to the video driver in the AVK 341 
via the data path 315. The AVK in turn provides data to the 
video hardware 124, if any, which in turn provides the video 
data to the video frame buffer 345 in the video adaptor 120. 

Audio data and commands are transferred from the MCI 
digital driver 311 to the audio driver 331 via data path 313. 
The audio driver 331 in turn provides the data to the audio 
card 130 via data path 333, and the speaker 132 produces 
sound corresponding to the audio data. 
The synchronization method of the present invention is
performed by synchronization module block 421. The synchronization
module 421 comprises a method that is preferably
implemented in software, and a source code listing of
one embodiment of this method is located at the end of this
specification. As noted above, the source code listing at the
end of this specification operates by interfacing directly to
the API of the audio and video drivers rather than going
through the MCI interface layer. Otherwise the source code
listing is similar to the preferred method. The MCI digital
driver 311 provides a signal to the synchronization module
421 over path 417. The computer system includes a timer
which periodically interrupts the MCI layer 309 and MCI
driver 311 and directs the MCI layer 309 to invoke the
synchronization module 421 of the present invention. When
the synchronization module 421 is invoked, the synchronization
method queries the audio and video drivers 331 and
341 for the current position of the audio and video data. The
synchronization module 421 is shown connected to the AVK
video driver 341 and the audio driver 331. As noted above,
the synchronization module 421 of the preferred embodiment
of the invention interfaces to the AVK video driver 341
and the audio driver 331 through the MCI interface layer. In
contrast, the source code listing at the end of this specification
implements an embodiment that accesses the AVK
video driver 341 and audio driver 331 directly via the API
of the respective drivers.
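
The timer-driven invocation described above can be pictured with a minimal Python sketch. Every name in it (`Driver`, `FakeDriver`, `SyncModule`, `on_timer`) is hypothetical and stands in for whichever access path is used, whether MCI calls or direct calls to the driver API; it is an illustration under those assumptions, not the code of the source listing.

```python
from dataclasses import dataclass
from typing import Protocol


class Driver(Protocol):
    """Hypothetical view of a driver: it can report its current position."""

    def position(self) -> int: ...


@dataclass
class FakeDriver:
    """Stand-in driver used only to exercise the sketch."""

    current: int = 0

    def position(self) -> int:
        return self.current


class SyncModule:
    """Queries both drivers each time the timer invokes it."""

    def __init__(self, video: Driver, audio: Driver) -> None:
        self.video = video
        self.audio = audio

    def on_timer(self) -> tuple[int, int]:
        # Video position is a frame number; audio position is a byte count.
        return self.video.position(), self.audio.position()


video = FakeDriver(current=42)
audio = FakeDriver(current=30870)
module = SyncModule(video, audio)
print(module.on_timer())
```

Because the module only depends on a `position()` query, the same invocation logic would apply whether the drivers are reached through an interface layer or directly.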

The audio driver 331 provides audio position information
to the synchronization module 421 over path 429. The AVK
video driver 341 provides video frame position data over
signal path 423 to the synchronization module 421. The
synchronization module 421 uses the audio and video frame
rate information to compute a video tempo value that is
provided to the AVK video driver 341. Video tempo and
pause commands are conveyed from the synchronization
module to the AVK 341 over signal path 425, preferably
routed through the MCI layer 309 as discussed above. Also,
in the preferred embodiment, the synchronization module
421 provides a pause command to the audio driver 331. In
an alternate embodiment, the present invention maintains
synchronization by adjusting the audio tempo, and the
synchronization module 421 generates an audio playback
tempo command that is conveyed to the audio driver 331
over signal path 427.
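
A rough Python sketch of how a video tempo value might be derived from the two positions is given here. The half-and-half smoothing, the one-frame tolerance, and the return to the nominal rate when the error is zero follow the flowchart description in this specification. The tempo units (percent of normal speed), the error-to-tempo mapping in `select_tempo`, the use of the error magnitude in the tolerance test, and all function names are assumptions made for illustration only.

```python
NOMINAL_TEMPO = 100   # assumed units: percent of normal playback speed
TOLERANCE_FRAMES = 1  # tolerance given in the flowchart description


def select_tempo(error_frames: int) -> int:
    """Hypothetical stand-in for the lookup table: raw tempo for an error."""
    step = max(-5, min(5, error_frames))  # clamp so the tempo stays bounded
    return NOMINAL_TEMPO + 10 * step


def next_video_tempo(audio_frame: int, video_frame: int, last_tempo: int):
    """Return the tempo to send to the video driver, or None for no command."""
    error = audio_frame - video_frame  # positive when the audio is ahead
    # Smoothing: one half of the current value plus one half of the previous.
    tempo = (select_tempo(error) + last_tempo) // 2
    if abs(error) > TOLERANCE_FRAMES:  # assumption: tolerance on the magnitude
        # Skip the driver call when the tempo would not change.
        return None if tempo == last_tempo else tempo
    if error == 0 and last_tempo != NOMINAL_TEMPO:
        return NOMINAL_TEMPO  # return to nominal once the streams are in sync
    return None


print(next_video_tempo(13, 10, 100))  # audio 3 frames ahead of the video
print(next_video_tempo(10, 10, 115))  # back in sync after an adjustment
```

Returning `None` when no command is needed mirrors the description's aim of avoiding unnecessary calls to the video driver.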

Synchronization Method— Flowchart 

Referring now to FIGS. 5A and 5B, a flowchart diagram
illustrating operation of the synchronization method performed
by the synchronization module 421 according to the
present invention is shown. The main portion of the synchronization
method is located at lines 151-327 of the
source code listing at the end of this specification. The
synchronization method makes a function call to a function
referred to as audframe_decupl_aud_dev, which computes
the audio frame number from the audio position. This
function is located generally at lines 1-150 of the source
code listing.
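
The conversion from an audio byte position to an equivalent audio frame number can be sketched as follows. This is a hedged illustration and not the audframe_decupl_aud_dev function itself: the function names, the parameters, and the uncompressed PCM byte-rate formula are assumptions. It keeps the bytes-per-frame value as an exact fraction and uses integer-only arithmetic, in keeping with the specification's stated preference for avoiding floating point.

```python
from fractions import Fraction


def audio_bytes_per_frame(sample_rate_hz, bytes_per_sample, channels, video_fps):
    """Bytes of audio played during one video frame, as an exact fraction."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return Fraction(bytes_per_second) / Fraction(video_fps)


def audio_frame_number(audio_byte_position, bytes_per_frame):
    """Equivalent audio frame number, computed with integers only."""
    return (audio_byte_position * bytes_per_frame.denominator) // bytes_per_frame.numerator


# Example: 11025 Hz, 8-bit mono audio against 15 frames per second video.
bpf = audio_bytes_per_frame(11025, 1, 1, 15)
print(bpf, audio_frame_number(7350, bpf))
```

A non-integer frame rate can be passed as a fraction, for example `Fraction(30000, 1001)`, so that no rounding error accumulates over a long presentation.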

When the synchronization method is invoked, in step 502
the method determines if this is the first time the method has
been invoked. If so, then in step 504 the method calls the
audio driver to obtain the wave rate, i.e., the frequency in
kilohertz at which the audio is operating. In step 506 the method
calculates and stores the number of bytes that are in an audio
frame that is equivalent to a corresponding video frame. The
method obtains the wave rate at which the audio is playing,
determines how many bytes of audio are played each
second, then calculates the equivalent number of bytes for
each video frame using the known video frame rate. The
equation used in this calculation is:

[...]

In step 508 the synchronization method initializes other
variables. For example, the method initializes a previous
tempo variable to a starting value, preferably a nominal
value. As discussed further below, this previous tempo
variable is used to record the prior tempo variable provided
to the video driver the last time the synchronization module
was invoked. Other variables can be initialized as desired. If
the synchronization module is not being called for the first
time in step 502, then operation proceeds directly to step
510.

It is noted that the preferred embodiment of the invention
operates as shown in FIG. 5A in steps 502-508. However, in
an alternate embodiment, steps 502-508 are performed
elsewhere, such as in the audio driver 331 or the MCI layer
309. It is also noted that steps 502-508 are not included in
the source code listing at the end of this specification.

In step 510 the method determines the current video frame
number. The synchronization method of the present invention
calls the respective video driver 341 controlling the
video hardware (if any) to determine what video frame
number is currently being played. This call preferably uses
MCI interface commands. In step 512 the method calls the
respective audio driver to determine the current audio
position, i.e., which audio byte is currently being played.
This call also preferably utilizes MCI commands. In an
alternate embodiment, the call to the video driver is made
directly to the API of the video driver, and the call to the
audio driver involves a call directly to the API of the .WAV
audio driver 331 to determine what audio byte is currently
being played. In step 514 the method then calculates the
equivalent audio frame number being played using the audio
frame rate value calculated and stored in step 506. As shown
in the source code listing at the end of this specification, step
514 invokes a function referred to as audframe_decupl_aud_dev.
This function calculates the current audio frame
number using a fraction representing the number of bytes
per equivalent audio frame to determine the audio frame
number. This function preferably does not use floating point
numbers in the calculation due to the perceived unreliability
of floating point numbers in some programming environments.

Referring to FIGS. 5A, 5B, and 6, the preferred
embodiment of the present invention requires that there be
a common starting point for the audio and video data. FIG.
6 illustrates an audio data stream 604 and a video data
stream 606 having a common starting point 602. Each audio
data stream 604 includes audio frames 608 having audio
data, and each video data stream 606 includes video frames
610 having video data. There is a time index of zero where
the audio data stream 604 and video data stream 606 are both
in synchrony, such that a first byte of the audio data stream
604 and a first byte of the video data stream 606 are
generated simultaneously.

In step 516 the method then calculates the synchronization
error quantity. Clearly, because audio data stream 604
and video data stream 606 have a common starting point
602, calculating the synchronization error quantity essentially
involves subtracting the current video frame number
from the current audio frame number to determine the
number of frames by which the audio and video are out of
sync.

In step 518 the method determines if the audio is too far
ahead of the video and if the audio is still playing. In the
preferred embodiment the method determines if the audio is
more than 5 frames ahead of the video in step 518. If the
audio is determined to be too far ahead and is also playing
in step 518, then the method stops the audio in step 520 and
then advances to step 522. If the audio is either not too far
ahead, i.e., not more than 5 frames ahead, or the audio is not
playing, then operation advances directly to step 522. It is
noted that if the audio is determined to be too far ahead in
step 518, i.e., more than 5 frames ahead of the video, then the
synchronization method of the present invention may not be
working, i.e., the video tempo is not being set properly.
Another possibility is that the synchronization method is not
being called often enough. It is noted that if the video
advances too far ahead of the audio, then the video is simply
slowed down using a lower video tempo.

In step 522 the method determines if the audio is paused
and the video has caught up to the audio. In the preferred
embodiment, the video is considered to have caught up to
the audio if the audio is less than 2 frames ahead of the
video. If the audio is paused and the video is determined to
have caught up to the audio, then the audio is restarted in
step 524. Operation then advances to step 526 (FIG. 5B). If
either the audio is not paused or the video has not caught up
to the audio, then operation proceeds directly to step 526.

In step 526 the method selects a synchronization adjustment
factor, referred to as a tempo value, for the video driver
using a lookup table. In the source code listing at the end of
this specification, the video tempo value is actually selected
from a case statement. As noted above, in the preferred
embodiment the synchronization method adjusts the video
frame rate or video tempo to maintain the audio and video
data streams in sync. However, it is noted that in the present
invention either the audio or video stream rates can be
adjusted as desired. For example, the method could slow
down or speed up the video frame rate or slow down or
speed up the audio frame rate as desired to maintain the
respective audio and video data streams in sync. It is noted
that adjusting the audio data stream may be a simpler
procedure than adjusting the video data stream. The audio
data stream was not adjusted in the preferred embodiment
because of concerns that the user might be able to hear the
audio adjustments. However, experimentation has shown
that adjustments to the audio data stream would generally
not be detectable by the user if the synchronization method
of the present invention was invoked a sufficient number of
times each second.

In step 528 the method determines if the video has started
to play. If not, then the method sets the video tempo value
to a slow value. This is done merely to begin the video
portion of the presentation at a slow rate. If the video has
started to play, operation proceeds directly to step 532. In
step 532 the method adjusts the video tempo value using a
smoothing or dampening function. The preferred embodiment
uses a smoothing formula which combines one half of
the current tempo value plus one half of the previous tempo
value. This smoothing function operates to prevent overcompensation
and to add stability to the synchronization
method, thus allowing for smoother synchronization. The
adjustment performed in step 532 is similar to a damping
function.

In step 534 the method determines if the audio is paused.
If the audio is determined to not be paused in step 534, then
in step 536 the method determines if audio data is available.
If audio data is determined to not be available in step 536,
then it is assumed that the multimedia presentation does not
include any audio data. In this case the method exits since
it is not necessary to adjust the video playback speed or tempo
if there is no audio component. If audio data is determined
to be available in step 536, then operation advances to step
542.

If the audio is determined to be paused in step 534, then
in step 538 the method determines if the audio is ahead of
the video or if the audio status is reported as bad. If the audio
is determined to be equal to or behind the video in step 538,
then operation advances to step 542. If the audio is determined
to be ahead of the video in step 538 or if the audio
status is reported as bad, then the method sets the video
tempo to a nominal rate, i.e., a rate which is calculated to
be equal to the average speed of the audio playback. The
video tempo is set to the nominal rate in step 540 because it
is not necessary to set the video rate to large (fast) tempos
if the audio is paused and the audio is ahead of the video. In
cases where the audio is paused and is ahead of the video,
this typically means that the user has either clicked on the
PAUSE or STEP button during the presentation, or that the
audio has been halted because the audio and video were too
far out of sync.

In step 542 the method determines if the synchronization
error calculated in step 516 is greater than a set tolerance. In
the preferred embodiment the tolerance is set to 1 frame.
Thus the synchronization method does not adjust the video
tempo unless the audio and video are at least a certain
amount out of sync. If the audio and video are in sync or are
relatively close to being in sync, then the tempo is not
adjusted. This avoids having to call the video driver to adjust
the tempo and thus reduces the overhead caused by synchronization.
If the synchronization error is greater than the
tolerance in step 542, then in step 543 the method determines
if the tempo value is equivalent to the last_tempo
value, i.e., the tempo value sent to the video driver the last
time the synchronization method was executed. If the current
tempo value equals the last_tempo value, then operation
advances to step 554. In this instance the video driver
is already operating at this tempo, and thus there is no need
to call the video driver to set the tempo value to the same
value. This also serves to reduce the overhead of the
synchronization method. If the tempo value does not equal
the last_tempo value, then in step 544 the
method adjusts the video frame rate using the tempo value
determined in step 526 and adjusted in step 532. As noted
above, the method preferably adjusts the video frame rate
using MCI interface calls. In an alternate embodiment, as
shown in the source code listing, the synchronization
method passes the video frame rate or tempo value directly
to the video driver, using the AvkGrpTempo function call.
The video driver uses the received number to adjust the
video frame rate. Operation then advances to step 554.

If the synchronization error is determined to not be greater
than the tolerance in step 542, then in step 546 the method
determines if the synchronization error is 0 and if the
previous tempo was not set to the nominal rate. If either the
synchronization error is not 0 or the previous tempo was set
to the nominal rate, then operation advances to step 554. In
this instance it is not necessary to adjust the tempo because
the synchronization error was determined to be less than the
tolerance in step 542. If the synchronization error is 0 and
the last tempo is not equal to the nominal rate, then an extra
stabilizing effect is added. This extra stabilizing effect is
referred to as a "lock-down" function. In step 548 the
method sets the tempo to the nominal rate and in step 550 the
method adjusts the video frame rate by making the appropriate
MCI interface call to the video driver. In step 552 the
method sets the last_tempo variable to the nominal rate to
prevent steps 548-552 from being performed the next time
the synchronization method is called.

As noted above, steps 548-552 are performed when the
synchronization error is 0 but the prior tempo was not set to
the nominal rate. Here it is desirable to eliminate the effects
of the smoothing function applied in step 532. If the audio
and video are in sync, the tempo value produced
by the smoothing function in step 532 will suggest
that the streams are out of sync because part of the
tempo value calculation is the weighted average of the
previous tempo value and the current tempo value. If the
current error is zero, but a previous adjustment was
required, the synchronization method would report that a
tempo adjustment is necessary. Therefore, in this instance
the tempo is set to the nominal rate just as if the smoothing
function had not been applied.

Thus, this lock-down function ensures that the synchronization
method does not overcompensate when the audio
and video are in sync. In other words, this function adds
stability by locking on a point where the audio and video are
in sync to prevent overcompensation from occurring, i.e.,
primarily to prevent the smoothing function applied in step
532 from pushing the audio and video out of sync. Without
this lock-down function, if the audio and video data streams
were in sync, the smoothing function would cause the
streams to fall out of sync, i.e., would cause the video data
stream to oscillate between being ahead of or behind the
audio data stream.

In step 554 the method [...]. The synchronization method then
completes.

Conclusion

Therefore, a method and apparatus for synchronizing the
audio and video portions of a multimedia display is shown.
This method provides superior synchronization over methods
found in the prior art.

Although the method and apparatus of the present invention
has been described in connection with the preferred
embodiment, it is not intended to be limited to the specific
form set forth herein, but on the contrary, it is intended to
cover such alternatives, modifications, and equivalents, as
can be reasonably included within the spirit and scope of the
invention as defined by the appended claims.

We claim:

1. A method for synchronizing audio and video data
streams having a common starting point during a multimedia
presentation, comprising the steps of:

determining a current position of the video data stream
relative to the common starting point;

determining a current position of the audio data stream
relative to the common starting point;

calculating a synchronization error related to a difference
between the respective video data stream and audio
data stream current positions using the current positions
of the audio and video data streams;

adjusting a tempo of one of the data streams based on the
synchronization error if necessary to place the audio
and video data streams in synchrony;

repeating the determining steps and the steps of calculating
and adjusting during the multimedia presentation to
maintain the audio and video data streams in synchrony.

2. The method of claim 1, further comprising:

determining if the synchronization error is greater than a
tolerance value after the step of calculating and prior to
the step of adjusting, and





wherein the step of adjusting the tempo is performed only
if the synchronization error is greater than the tolerance
value.

3. The method of claim 1, further comprising:

determining a tempo value from the synchronization error
calculated in the step of calculating prior to the step of
adjusting; and

wherein the step of adjusting comprises adjusting the
tempo of one of the data streams using the determined
tempo value.

4. The method of claim 3, further comprising:

storing the determined tempo value;

calculating a second synchronization error by repeating at
a subsequent time the steps of determining a current
position of the video data stream, determining a current
position of the audio data stream, and calculating a
synchronization error;

determining a second tempo value from the second
synchronization error; and

repeating the step of adjusting based on the second
synchronization error if the second determined tempo
value does not equal the stored determined tempo
value.

5. The method of claim 4, further comprising:

applying a smoothing function to the second determined
tempo value using the stored determined tempo value
and second determined tempo value prior to the step of
adjusting, wherein the smoothing function prevents
overcompensating the tempo adjustment and adds
stability to the synchronizing method.

6. The method of claim 5, further comprising:

determining if the synchronization error is 0 and if the
immediately prior tempo value was not equal to a
nominal rate prior to the step of adjusting; and

wherein the step of adjusting the tempo comprises setting
the tempo to the nominal rate if the synchronization
error is 0 and the immediately prior tempo value was
not equal to a nominal rate.

7. The method of claim 1, wherein the step of adjusting
the tempo comprises adjusting the tempo of the video data
stream.

8. The method of claim 1, wherein the step of adjusting
the tempo comprises adjusting the tempo of the audio data
stream.

9. The method of claim 1, further comprising:

determining if the audio data stream is far ahead of the
video data stream and if the audio is playing after the
steps of determining respective current positions; and

halting playback of the audio data stream if the audio data
stream is far ahead of the video data stream and the audio
is playing.

10. The method of claim 9, further comprising:

determining if the audio playback is paused and if the
video data stream has approximately caught up to the
audio data stream; and

restarting the audio playback if the audio playback is
paused and the video data stream has approximately
caught up to the audio data stream.

11. The method of claim 1, wherein the video data stream
is processed in a video subsystem and the audio data stream
is processed in an audio subsystem such that the video data
stream is decoupled from the audio data stream.

12. The method of claim 13, wherein each video frame
has a video frame number relative to the common starting
point, wherein the step of determining the current position of
the video data stream comprises determining a current video
frame number being played.

13. The method of claim 1, wherein at any point in time
during the multimedia presentation an audio frame and a
video frame are being played and the step of determining the
current position of the audio data stream includes the steps
of:

determining which audio byte is currently being played;
obtaining a frequency at which the audio is being played;
determining how many bytes of audio are being played
per unit of time;
determining a number of bytes per audio frame equivalent
to a number of bytes per video frame; and
determining an equivalent audio frame number being
played relative to the common starting point.

14. The method of claim 12, wherein the step of calcu- 
lating a synchronization error comprises calculating a dif- 
ference between the current video frame number and the 
determined equivalent audio frame number. 
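
The position calculation recited in claims 13 and 14, together with the
one-half/one-half smoothing recited later in claims 35 and 40, can be
illustrated with a short sketch. It is an illustration under stated
assumptions, not the patented implementation: the function names, the
uncompressed audio parameters, and the sign convention of the error
(video minus audio) are hypothetical choices made for the example.

```python
def bytes_per_audio_frame(sample_rate_hz, bytes_per_sample, channels, video_fps):
    """Audio bytes played during one video frame period (assumes uncompressed audio)."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return bytes_per_second / video_fps


def equivalent_audio_frame(audio_byte_position, frame_bytes):
    """Audio position expressed as a frame number relative to the common starting point."""
    return int(audio_byte_position // frame_bytes)


def sync_error(video_frame_number, audio_frame_number):
    """Difference between the video and audio positions, in frames (sign is an assumption)."""
    return video_frame_number - audio_frame_number


def smooth_tempo(current_tempo, prior_tempo):
    """One half of the current tempo value plus one half of the prior tempo value."""
    return 0.5 * current_tempo + 0.5 * prior_tempo


if __name__ == "__main__":
    # Hypothetical stream: 22,050 Hz, 8-bit mono audio against 30 frames/s video.
    frame_bytes = bytes_per_audio_frame(22050, 1, 1, 30)
    audio_frame = equivalent_audio_frame(73500, frame_bytes)
    print(frame_bytes, audio_frame, sync_error(97, audio_frame))
    print(smooth_tempo(1.5, 1.0))
```

With these hypothetical numbers the video is three frames behind the
audio, so a controller built on this sketch would raise the video tempo,
and the smoothing step would then move the tempo only halfway toward
the newly determined value.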

15. The method of claim 1, wherein the synchronizing
method is periodically performed each time after the
expiration of a predetermined time interval.

16. A computer system which synchronizes audio and
video data streams, having respective audio and video data,
during an audio-visual display, comprising:

one or more storage devices for storing the audio and
video data for the audio-visual display;

a video monitor coupled to the one or more storage 
devices for generating a video display corresponding to 
the video data in the video data stream; 

a speaker coupled to one or more storage devices for 
generating sounds corresponding to the audio data in 
the audio data stream; 

one or more data paths for transmitting the audio and 
video data streams corresponding to the audio and 
video data from the one or more storage devices to the 
speaker and the video monitor, respectively; 

means coupled to the one or more storage devices, the one 
or more data paths, the video monitor and the speaker 
for obtaining a current position of the video data stream 
relative to a common starting point of the audio and 
video data streams; 

means coupled to the one or more storage devices, the one 
or more data paths, the video monitor and the speaker 
for obtaining a current position of the audio data stream 
relative to the common starting point; 

means coupled to both the means for obtaining for cal- 
culating a synchronization error related to a difference 
between the respective video data stream and audio 
data stream current positions using the current positions 
of the audio and video data streams; and 

means coupled to the one or more data paths and the 
synchronization error calculating means for adjusting a 
tempo of one of the data streams based on the synchro- 
nization error if necessary to place the audio and video 
data streams in synchrony; and 

wherein the means for obtaining, means for calculating, 
and means for adjusting repeat their respective func- 
tions during the audio-visual display to maintain the 
audio and video data streams in synchrony. 

17. The computer system of claim 16, further comprising:
means coupled to the calculating means for determining if
the synchronization error is greater than a tolerance
value;

wherein the means for adjusting the tempo adjusts the
tempo only if the synchronization error is greater than
the tolerance value.

18. The computer system of claim 16, further comprising:
means for determining a tempo value from the
synchronization error calculated by the means for calculating;

wherein the means for adjusting adjusts the tempo of one
of the data streams using the determined tempo value.

19. The computer system of claim 18, further comprising:
means coupled to the tempo value determining means for
storing the determined tempo value, wherein the stored
determined tempo value is used as a prior tempo value
the next time the synchronizing method is performed;

means coupled to the tempo value determining means and
the storing means for determining if the determined
tempo value equals the prior tempo value;

wherein the means for adjusting is not executed if the
determined tempo value equals the prior tempo value.

20. The computer system of claim 18, further comprising:
means coupled to the tempo value determining means for
storing the determined tempo value;

means coupled to the tempo value determining means and
the storing means for applying a smoothing function to
the determined tempo value using a tempo value stored
by the means for storing, wherein the smoothing
function prevents the means for adjusting the tempo from
overcompensating the tempo adjustment and adds
stability to the tempo adjustment.

21. The computer system of claim 20, further comprising:
means coupled to the calculating means for determining if
the synchronization error is 0 and if the immediately
prior tempo value was not equal to a nominal rate; and

wherein the means for adjusting the tempo comprises
means for setting the tempo to the nominal rate if the
synchronization error is 0 and the immediately prior
tempo value was not equal to a nominal rate.

22. The computer system of claim 16, wherein the means
for adjusting the tempo adjusts the tempo of the video data
stream.

23. The computer system of claim 16, wherein the means
for adjusting the tempo adjusts the tempo of the audio data
stream.

24. The computer system of claim 16, further comprising:
means coupled to the one or more data paths for
determining if the audio data stream is far ahead of the video
data stream and if the audio is playing; and

means for halting playback of the audio data stream if the
audio data stream is far ahead of the video data stream
and the audio is playing.

25. The computer system of claim 24, further comprising:
means coupled to one or more data paths for determining
if the audio playback is paused and if the video data
stream has approximately caught up to the audio data
stream;

means coupled to the one or more data paths for restarting
the audio playback if the audio playback is paused and
the video data stream has approximately caught up to
the audio data stream.

26. The computer system of claim 16, wherein the one or
more data paths comprises an audio data path for
transmitting the audio data stream and a video data path for
transmitting the video data stream.

27. The computer system of claim 28, wherein the means
for obtaining the current position of the video data stream
comprises means for obtaining a current video frame
number.

28. The computer system of claim 16, wherein the audio
data stream and the video data stream include audio frames
and video frames, respectively, each audio and video frame
includes a respective frame number relative to the common
starting point, the means for obtaining the current position of
the audio data stream comprising:

means for determining which audio byte is currently
being played;

wherein said means for obtaining the current position of
the audio data stream includes means for calculating an
equivalent audio frame number relative to the common
starting point using the audio byte currently being
played, using an audio play frequency, using a number of
bytes of audio played per unit of time, and using a
number of bytes per audio frame equivalent to a
number of bytes per video frame.

29. The method of claim 27, wherein the means for
calculating a synchronization error calculates a difference
between the current video frame number and the calculated
equivalent audio frame number.

30. The method as in claim 1 wherein the common
starting point is a time index of zero where the audio and
video data streams are both in synchrony, such that a first
byte of audio data and a first byte of video data are generated
simultaneously.

31. The method as in claim 1 wherein the multimedia
presentation utilizes a computer system having a video
driver, and wherein the video data stream includes video
frames, and wherein the video data stream current position
determining step comprises querying the video driver to
determine a current video frame number being played, the
method further comprising:

determining a tempo value using the synchronization
error;

wherein the tempo adjusting step comprises passing the
tempo value to the video driver.

32. The method as in claim 1 wherein the multimedia
presentation utilizes a computer system having an audio
driver and the audio data stream current position
determining step comprises the step of

querying the audio driver to determine the current audio
data stream position.

33. The method as in claim 1 wherein the video data
stream and the audio data stream are respectively free from
time stamps, and the video data stream is free from any
timing information relative to the audio and video data
stream correspondence.

34. The method as in claim 1 wherein the tempo is a rate
of one of the data streams.

35. The method as in claim 5 wherein the smoothing
function combines one half of a current tempo value plus
one half of the prior tempo value.

36. The computer system as in claim 16 wherein the
current position of the video data stream is represented by a
video frame number, the computer system further
comprising:

a video driver means, coupled to the means for obtaining
a current position of the video data stream, for passing
the video frame number to the means for obtaining a
current position of the video data stream.

37. The computer system as in claim 16 wherein the
current position of the audio data stream is represented by an
audio frame number, the computer system further
comprising:

an audio driver means, coupled to the means for obtaining
a current position of the audio data stream, for passing
the audio frame number to the means for obtaining a
current position of the audio data stream.

38. The computer system as in claim 16 wherein the
common starting point is a time at which a first byte of the
audio data and a first byte of the video data are in synchrony.

39. The computer system as in claim 16 wherein the
tempo is a rate of one of the data streams.

40. The computer system as in claim 20 wherein the
smoothing function combines one half of a current tempo
value plus one half of the prior tempo value.

* * * * *

[57] ABSTRACT

A computer-based media data processor for controlling 
transmission of digitized media data in a packet switching 
network. When the processor receives a request from a 
network client node for presentation of specified media data 
stream presentation unit sequences, the processor in response
retrieves media data from a corresponding media access
location, determines the media data type of each presentation
unit in the retrieved media data, and designates each
retrieved presentation unit to a specific media data
presentation unit sequence based on the media data type
determination for that presentation unit. The processor then
assembles a sequence of presentation descriptors for each of 
the specific presentation unit sequences, all presentation 
descriptors in an assembled sequence being of a common 
media data type, and then assembles transmission presenta- 
tion unit packets each composed of at least a portion of a 
presentation descriptor and its media data, all presentation 
descriptors and media data in an assembled packet being of 
a common media data type. The assembled packets are then 
released for transmission via the network to the client 
processing node requesting presentation of the specified 
presentation unit sequences. 

105 Claims, 12 Drawing Sheets

DIGITAL MEDIA DATA STREAM NETWORK
MANAGEMENT SYSTEM

BACKGROUND OF THE INVENTION

This invention relates to the management of digitized
media stream data, e.g., digitized video, and particularly
relates to the capture, storage, distribution, access and
presentation of digital video within a network computing
environment.

Extensive technological advances in microelectronics and 
digital computing systems have enabled digitization of a 
wide range of types of information; for example, digital
representations of text, graphics, still images and audio are
now in widespread use. Advances in compression, storage,
transmission, processing and display technologies have 
recently provided the capabilities required to extend the field 
of digitization to additionally include video information. 

Conventionally, digitized audio and video are presented
on, for example, a computer system or network by capturing
and storing the audio and video streams in an interleaved
fashion, i.e., segments of the two streams are interleaved.
This requires storage of the digital audio and video in a 
single stream storage container, and further requires retriev- 
ing chunks of interleaved audio and video data at an aggre- 
gate rate which matches the nominal rate of an active 
presentation sequence. In this way, one unit of video (say, a 
frame) is physically associated in storage with one unit of 
audio (say, a corresponding 33 msec clip), and the two are 
retrieved from storage as a unit. Sequences of such audio 
and video units are then provided to a presentation and 
decoder digital subsystem in an alternating fashion, whereby 
each audio and video unit of a pair is provided in sequence. 

Computer systems that provide this audio and video 
management functionality typically include digital 
compression/decompression and capture/presentation hard- 
ware and software, and digital management system 
software, all of which is based upon and depends upon the 
interleaved format of the audio and video streams it pro- 
cesses. 

Currently, handling of audio and video in a network 
environment is also based on a scheme in which capture, 
storage, and transmission of audio and video must be carried 
out using interleaved audio and video streams. This inter- 
leaving extends to the transmission of audio and video 
streams across the network in an interleaved format within 
transmission packets. 

Synchronization of audio with video during an active 
presentation sequence is conventionally achieved by ini- 
tially interleaving the audio and video streams in storage and 
then presenting audio and video chunks at the nominal rate 
specified for an active presentation sequence. 

In "Time Capsules: An Abstraction for Access to
Continuous-Media Data," by Herrtwich, there is disclosed a
framework based on time capsules to describe how timed
data shall be stored, exchanged, and accessed in real-time
systems. When data is stored into such a time capsule, a time
stamp and a duration value are associated with the data item.
The time capsule abstraction includes the notion of a clock
for ensuring periodic data access that is typical for
continuous-media applications. By modifying the parameters
of a clock, presentation effects such as time lapses or
slow motion may be achieved.

While the Herrtwich disclosure provides a time capsule
abstraction for managing time-based data, the disclosure
does not provide any technique for synchronizing time-based
data based on the time capsule abstraction, and does
not address the requirements of time-based data management
in a network environment. Furthermore, the disclosure
does not address processing of time-based data streams as a
function of their interleaved format or manipulation of that
format.

SUMMARY OF THE INVENTION 

In general, in one aspect, the invention features a
computer-based media data processor for controlling the
computer presentation of digitized continuous time-based
media data composed of a sequence of presentation units.
Each presentation unit is characterized by a prespecified
presentation duration and presentation time during a
computer presentation of the media data and is further
characterized as a distinct media data type. In the processor of the
invention, a media data input manager retrieves media data
from a computer storage location in response to a request for
computer presentation of specified presentation unit
sequences, and determines the media data type of each
presentation unit in the retrieved media data. The input
manager then designates each retrieved presentation unit to
a specified media data presentation unit sequence based on
the media data type determination for that presentation unit.
The input manager then assembles a sequence of presentation
descriptors for each of the specified presentation unit
sequences, each descriptor comprising media data for one
designated presentation unit in that sequence, and each
sequence of presentation descriptors being of a common
media data type; and then associates each presentation
descriptor with a corresponding presentation duration and
presentation time, based on the retrieved media data. Finally,
the input manager links the presentation descriptors of each
sequence to establish a progression of presentation units in
that sequence.

A media data interpreter of the invention indicates a start
time of presentation processing of the presentation descriptor
sequences, and accordingly, maintains a current
presentation time as the sequences are processed for presentation.
The interpreter counts each presentation unit in the media
data sequences after that unit is processed for presentation,
to maintain a distinct current presentation unit count for each
sequence, and compares for each of the presentation unit
sequences a product of the presentation unit duration and the
current presentation unit count of that sequence with the
current presentation time after each presentation unit from
that sequence is processed for presentation. Based on the
comparison, the interpreter releases a presentation unit next
in that presentation unit sequence to be processed for
presentation when the product matches the current
presentation time count, and deletes a presentation unit next in that
presentation unit sequence when the product exceeds the
current presentation time count.
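
The comparison described in the preceding paragraph can be sketched as
a small decision function. This is a hedged illustration only: the
function name, the millisecond units, and the tolerance parameter are
assumptions, and the comparisons follow the wording of the paragraph
literally. The summary does not state what happens when the product is
below the current presentation time, so the sketch returns a neutral
"undecided" result for that case.

```python
def schedule_next_unit(unit_duration_ms, units_processed, current_time_ms,
                       tolerance_ms=0):
    """Decide what to do with the next presentation unit of one sequence.

    The product of the unit duration and the count of units already
    processed is compared with the current presentation time.
    """
    product_ms = unit_duration_ms * units_processed
    if abs(product_ms - current_time_ms) <= tolerance_ms:
        return "release"    # product matches the current presentation time
    if product_ms > current_time_ms:
        return "delete"     # product exceeds the current presentation time
    return "undecided"      # case not covered by the summary (assumption)


if __name__ == "__main__":
    # Hypothetical sequence with 33 ms units, 10 of which have been processed.
    for now_ms in (330, 300, 400):
        print(now_ms, schedule_next_unit(33, 10, now_ms))
```

Because each sequence keeps its own count, the same function can be
called independently for an audio sequence and a video sequence against
one shared presentation clock.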

In general, in another aspect, the invention features a
media data processor for controlling transmission of
digitized media data in a packet switching network. Such a
network comprises a plurality of client computer processing
nodes interconnected via packet-based data distribution
channels. In the invention, a remote media data controller
receives from a client processing node a request for
presentation of specified presentation unit sequences, and in
response to the request retrieves media data from a
corresponding media access location. A remote media data input
manager of the invention then determines the media data
type of each presentation unit in the retrieved media data,
and designates each retrieved presentation unit to a specified
media data presentation unit sequence based on the media
data type determination for that presentation unit. Then the
input manager assembles a sequence of presentation
descriptors for each of the specified presentation unit sequences,
each descriptor comprising media data for one designated
presentation unit in that sequence, and all presentation
descriptors in an assembled sequence being of a common
media data type. The interpreter associates each presentation
descriptor with a corresponding presentation duration and
presentation time, based on the retrieved media data; and
finally, links the descriptors in each assembled sequence to
establish a progression of presentation units in each of the
specified presentation unit sequences.

A remote network media data manager of the invention
assembles transmission presentation unit packets each
composed of at least a portion of a presentation descriptor and its
media data, all presentation descriptors and media data in an
assembled packet being of a common media data type; and
releases the assembled packets for transmission via the
network to the client processing node requesting
presentation of the specified presentation unit sequences.

A local media data controller of the invention transmits
the presentation unit sequence request to the remote media
data controller from the client processing node, and controls
starting and stopping of sequence presentation in response to
user specifications.

A local network media data manager of the invention
receives at the client processing node the transmission
presentation unit packets via the network, and designates a
presentation unit sequence for each presentation descriptor
and its media data in the received packets to thereby
assemble the presentation descriptor sequences each
corresponding to one specified presentation unit sequence, all
presentation descriptors in an assembled sequence being of
a common media data type. Then the local network media
data manager links the descriptors in each assembled
sequence to establish a progression of presentation units for
each of the presentation unit sequences.

A local media data interpreter of the invention accepts the
assembled presentation descriptor sequences one descriptor
at a time and releases the sequences for presentation one
presentation unit at a time. In this process, the local
interpreter indicates a start time of presentation processing of the
sequences, and accordingly, maintains a current presentation
time as the descriptor sequences are processed for
presentation. Based on the presentation duration of each
presentation unit, the interpreter synchronizes presentation of the
specified presentation unit sequences with the current
presentation time.

In preferred embodiments, the specified media data
presentation unit sequences comprise a video frame sequence
including a plurality of intracoded video frames; preferably,
each frame of the video frame sequence comprises an
intracoded video frame, and more preferably, the video
frame sequence comprises a motion JPEG video sequence
and an audio sequence. In other preferred embodiments,
each of the plurality of intracoded video frames comprises a
key frame and is followed by a plurality of corresponding
non-key frames, each key frame including media data
information required for presentation of the following
corresponding non-key frames.

In other preferred embodiments, synchronization of
presentation of the specific presentation unit sequences is
accomplished by the local media data interpreter by
comparing for each of the presentation descriptors in each of the
presentation descriptor sequences the presentation time
corresponding to that descriptor with the currently maintained
presentation time. Based on this comparison, the interpreter
releases a next sequential presentation unit to be processed
for presentation when the corresponding presentation time
of that descriptor matches the current presentation time, and
deletes a next sequential presentation unit to be processed
for presentation when the current presentation time exceeds
the corresponding presentation time of that descriptor.

In other preferred embodiments, synchronization of
presentation of the specific presentation unit sequences is
accomplished by the local media data interpreter by
counting each presentation descriptor in the sequences after that
presentation unit is released to be processed for presentation,
to maintain a distinct current presentation unit count for each
sequence. Then, the interpreter compares for each of the
presentation unit sequences a product of the presentation
unit duration and the current presentation descriptor count of
that sequence with the currently maintained presentation
time after a presentation unit from that sequence is released
to be processed for presentation. Based on the comparison,
the interpreter releases a next sequential presentation unit in
that presentation unit sequence when the product matches
the currently maintained presentation time, and deletes a
next sequential presentation unit in that presentation unit
sequence when the product exceeds the currently maintained
presentation time.

In other preferred embodiments, the remote media data
controller of the invention receives from the local media
data controller, via the network, an indication of a specified
presentation data rate at which the specified presentation
unit sequences are to be transmitted via the network to the
client node. The media data retrieved comprises a plurality
of storage presentation unit sequences stored in a computer
storage location, each storage presentation unit sequence
composed of presentation units corresponding to a specified
presentation unit sequence and all presentation units in a
storage presentation unit sequence being of a common
media data type. The remote media data input manager
designates each of a portion of the presentation unit
descriptors as the descriptor sequences are assembled, the portion
including a number of descriptors based on the specified
presentation data rate, each designated descriptor comprising
null media data, to thereby compose the presentation
descriptor sequences with only a portion of storage
presentation unit media data, whereby the specified
presentation unit sequences attain the specified presentation
data rate of transmission.

In the invention, the separation of media streams and
distinctly formatting of network transmission packets for
each stream provides an opportunity and the facility to
examine, process, and make transmission decisions about
each stream and each presentation unit independent of other
streams and presentation units. As a result, the media
processor of the invention can make presentation decisions
about a given presentation unit independent of the other
units in the corresponding stream, and can make those
decisions "on-the-fly". This capability provides for real time
scaling and network load adjustment as a stream is retrieved,
processed, and transmitted across the network.

Further aspects, features, and advantages of the invention
are set forth in the following specification and the claims.

DESCRIPTION OF THE DRAWING

FIG. 1 is a schematic diagram of media stream access and
delivery points with which the digital video management
system of the invention may interface;

FIG. 2 is a schematic diagram of a stand-alone
implementation of the digital video management system of the
invention;

FIG. 3 is a schematic diagram of a network implementa- 
tion of the digital video management system of the inven- 
tion; 

FIG. 4 is a schematic diagram of the local digital video 
management system manager modules of the invention; 

FIG. 5 is a schematic diagram illustrating the flow of 
media stream data between the stream I/O manager and 
stream interpreter modules of the local digital video man- 
agement system manager of FIG. 4; 

FIG. 6 is a schematic flow chart illustrating presentation 
and capture scenarios carried out by the local digital video 
management system manager of FIG. 4; 

FIG. 7 is a schematic illustration of the translation from 
media stream storage format to token format carried out by 
the local digital video management system manager of FIG. 
4; 

FIG. 8 is a schematic flow chart illustrating presentation 
and capture scenarios carried out by a digital video system 
used in conjunction with the local digital video management 
system manager scenarios of FIG. 6; 

FIG. 9 is a schematic diagram of the local digital video 
management system manager and the remote digital video 
management manager modules of the invention in a network 
implementation; 

FIG. 10 is a schematic diagram illustrating the flow of 
media stream data between the remote and local digital 
video management manager modules of the invention in a 
network implementation; 

FIG. 11A is a schematic flow chart illustrating presenta- 
tion and capture scenarios carried out by the remote digital 
video management system manager of FIG. 9; 

FIG. 11B is a schematic flow chart illustrating presenta- 
tion and capture scenarios carried out by the local digital 
video management system manager of FIG. 9; 

FIG. 12 is a schematic illustration of the translation of
stream tokens of FIG. 7 into packet format.

DESCRIPTION OF A PREFERRED 
EMBODIMENT 

Referring to FIG. 1, there is illustrated the digital video
management system (DVMS) 10 of the invention. The
DVMS provides the ability to capture, store, transmit,
access, process and present live or stored media stream data,
independent of its capture or storage location, in either a
stand-alone or a network environment. The DVMS
accommodates media stream data, i.e., continuous, high data-rate,
real-time data, including video, audio, animation,
photographic stills, and other types of continuous, time-based
media data. Throughout this description, the DVMS of the
invention will be explained with reference to audio and
video streams, but it must be remembered that any
time-based media data stream may be managed in the system. In
the DVMS, as shown in FIG. 1, media data may be accessed
from, e.g., live analog capture, analog or digital file storage,
or live digital capture from, e.g., a PBX (private branch
exchange) server, among other access points. The accessed
media is managed by the DVMS for delivery to, e.g., a
presentation monitor, a computer system for editing and
presentation on the computer, a VCR tape printer, or digital
storage, or sent to a PBX server.

Of great advantage, the DVMS management scheme is
independent of any particular storage or compression
technology used to digitize the data streams, and further, is
independent of any particular communication protocols or
delivery platform of a network in which the DVMS is
implemented. Additionally, the DVMS is industry
standards-based yet is flexible and standards-extensible, via
its layered architecture, which incorporates multiple
management platforms. Each of these features and advantages
will be explained in detail in the discussion to follow.

Digital Video Management System Components

The DVMS of the invention is based on a technique
whereby media data streams are handled and managed as
distinct and separate media data streams in which there is no
interleaving of media data. Here the term "stream" is meant
to represent a dynamic data type, like video, as explained
above, and thus, a stream consists of dynamic information
that is to be produced and consumed in a computer system
or network with temporal predictability. A stream contains a
succession of sequences. Sequences can themselves contain
sequences; in turn, each sequence contains a succession of
segments. Streams, sequences and segments, as information
identifiers, have no media type-specific semantics. Rather,
they are convenient abstractions for specifying and
organizing dynamic data types to be managed by the management
system of the invention. An easily understood analogy to
streams, sequences and segments is that of documents
containing chapters, sections and sentences.

Streams are characterized by their media data type, e.g.,
audio, video, or animation data types. Sequences represent
information that is meaningful to the user. For example, a
video sequence may represent a video clip containing a
video scene. Segments can be convenient "chunks" of data
for editing and mixing that data. Segments may also
represent units of data that are temporally linked, as when using
a video compression scheme that produces key video frames
and corresponding following difference video frames.

In the DVMS of the invention, streams that are intended for
synchronous presentation can be grouped into a stream group of
distinct constituent streams (i.e., without interleaving).
Although constituent streams in such a stream group may be stored
in an interleaved form within a storage container, the DVMS can
dynamically coordinate separately stored streams; in either case,
the system processes the streams distinctly, rather than in an
interleaved fashion.

Segments of streams contain presentation units. A presentation
unit is a unit of continuous, temporally-based data to be
presented, and accordingly, has an associated presentation time
and presentation duration. A presentation time indicates the
appropriate point in the sequence of a presentation at which the
associated presentation unit is to be played, relative to a time
base for the ongoing presentation. A presentation duration
indicates the appropriate interval of time over which the
associated presentation unit is to be played in the ongoing
presentation. Thus, a video presentation unit comprises a video
frame, and an audio presentation unit comprises a number of sound
samples associated with a frame duration.
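
The containment of sequences, segments and presentation units
within a stream, and the presentation time and duration carried
by each presentation unit, may be pictured with a small data
model. The following Python sketch is illustrative only; the
class names, the field names, the use of milliseconds and the
33 msec frame duration are assumptions chosen for the example and
are not prescribed by the system described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PresentationUnit:
    payload: bytes      # one video frame, or the audio samples for one frame duration
    time_ms: int        # presentation time, relative to the presentation time base
    duration_ms: int    # interval over which the unit is to be played


@dataclass
class Segment:
    units: List[PresentationUnit] = field(default_factory=list)


@dataclass
class Sequence:
    segments: List[Segment] = field(default_factory=list)
    subsequences: List["Sequence"] = field(default_factory=list)  # sequences may nest


@dataclass
class Stream:
    media_type: str     # e.g. "audio" or "video"
    sequences: List[Sequence] = field(default_factory=list)


def make_video_segment(frames: List[bytes], start_ms: int = 0, frame_ms: int = 33) -> Segment:
    """Give each frame a presentation time, assuming a fixed frame duration."""
    units = [PresentationUnit(f, start_ms + i * frame_ms, frame_ms)
             for i, f in enumerate(frames)]
    return Segment(units)


segment = make_video_segment([b"f0", b"f1", b"f2"])
video = Stream("video", [Sequence([segment])])
```

In this sketch a stream group would simply be a collection of
such Stream objects, each kept distinct rather than interleaved.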

As mentioned above, the DVMS may be implemented in a stand-alone
computer system or a computer-based, packet switched network.
Referring to FIG. 2, in a stand-alone computer system
implementation 12, live or stored media streams are accessed and
captured for presentation and editing on the stand-alone computer
14. The captured, and optionally edited, media streams may then
be delivered to a presentation monitor or to a VCR tape printer
utility.

Referring to FIG. 3, a packet switching network in which
the DVMS is implemented comprises desktop computer
systems 18 which are linked via a packet switching network




80, which is controlled by the DVMS network implementation 16.
The network 80 may comprise a local area network (LAN) or a wide
area network (WAN), or a combination of one or more LANs and
WANs. The DVMS provides access to and capture of media streams
from live analog video capture, e.g., a VCR or camcorder, a
network, storage or PBX server, or one of the desktop computers,
and in turn manages the transmission of the media stream data
across the network back to any of the access points.

The digital video management system consists of a local DVMS
manager and a remote DVMS manager. The local DVMS manager
provides a client operating environment, and thus resides on a
stand-alone computer or each client computer in a network,
"client" here being defined as a computer system or one of the
access points in a network that requests media data; the remote
DVMS manager provides a network operating environment, and thus
resides on a network server. The local DVMS manager may be
implemented on, for example, IBM-compatible personal computers
running Microsoft® Windows™, to thereby provide high-level,
industry-standard access to underlying digital video services.
This local DVMS manager implementation may support, for example,
the industry-standard Microsoft® digital video MCI API for
application development. The local DVMS manager incorporates an
efficient data-flow subsystem, described below, that is highly
portable to other operating systems.

The DVMS system of the invention is preferably implemented as an
application programming interface suite that includes interfaces
for a computer programming application to include media data
stream management capability within the application. Thus, the
DVMS interfaces with an underlying programming application via
interface calls that initiate media data stream functions within
the realm of the programming application. Such an interface
implementation will be understandable to those skilled in the art
of C programming.

The remote DVMS manager acts to dynamically link a client and a
server in the packet network environment. The architecture of
this manager has the important advantage of supporting the
ability to scale distinct, noninterleaved media data streams, as
discussed in depth below. This ability to scale packet-based
video, thereby creating scalable packet video, is a facility
which permits adaptive bandwidth management for dynamic media
data types in both LANs and WANs. The remote DVMS manager may be
implemented as a NetWare® Loadable Module, on, for example, the
Novell NetWare® operating system.

Local DVMS Manager

The local DVMS manager manages the access and capture of media
data streams transparently, i.e., without impacting the
functionality of the application program which requested that
access and capture. The local DVMS manager works with a digital
video system, implemented either in special purpose digital video
hardware or in special purpose software-based emulation of the
digital hardware.

Referring to FIG. 4, the local DVMS manager 20 consists of three
modules: the stream controller 24, stream input/output (I/O)
manager 26, and the stream interpreter 28. This modularity is
exploited in the DVMS design to separate the flow of data in a
media data stream from the flow of control information for that
media stream through the system. Based on this data and control
separation, stream data and stream control information are each
treated as producing distinct interactions among the three
manager modules, which operate as independent agents. The I/O
manager, interpreter and controller agents are each mapped via
the local DVMS manager to independently schedulable operating
system processes with independent program control flow and data
space allocation. The flow of media stream data is managed by the
stream I/O manager 26 and the stream interpreter 28, while the
flow of control information is managed by the stream controller
24. Each of these management functions is explained in detail
below.

The stream I/O manager module 26 is responsible for the dynamic
supply of media data streams, e.g., audio and video streams, from
or to the stream interpreter. This module also provides efficient
file format handling functions for the media data, if it is
accessed via a storage file, e.g., a DVI® AVSS file. In a
stand-alone implementation of the DVMS of the invention, the
stream I/O manager provides retrieval and storage of media data
streams from or to points of media access, such as digital or
analog storage containers, while in a network implementation of
the DVMS, as described below, the remote DVMS manager modules
provide retrieval and storage at points of media access via the
network. Most importantly, the stream I/O manager performs a
translation from the representation of audio and video
information as that information is stored to the corresponding
dynamic computer-based representation. This translation is
explained in detail below.

The stream interpreter module 28 is responsible for managing the
dynamic computer-based representation of audio and video as that
representation is manipulated in a stand-alone computer or a
computer linked into a packet network. This dynamic management
includes synchronization of retrieved audio and video streams,
and control of the rate at which the audio and video information
is presented during a presentation sequence. In addition, the
stream interpreter module manages the capture, compression,
decompression and playback of audio and video information. This
module is, however, compression technology-independent and
additionally is device-independent. Base services of a digital
video subsystem, including, for example, hardware for capture and
presentation functions, are preferably implemented to be accessed
through a standard API suite of digital video primitives, which
encapsulate any functions unique to a particular compression or
device technology.

The following suite of primitive functions provides
device-independent access to the base services of a digital video
subsystem:

Open: Open a specified device, initialize it, and return a
handle for further requests;

Close: Close a specified device and free up any associated
resources;

Get_Capabilities: Query a device's capabilities, e.g., display
resolutions, compression format, etc.;

Start: Start decoding and displaying data from a stream buffer;

Stop: Stop decoding and displaying data from a stream buffer;

Get_Info: Get information about the current status of a device;

Set_Info: Set information in the device attributes.

The stream controller module 24 is responsible for the control of
video and audio capture and playback functions during
user-directed applications. This control includes maintaining the
dynamic status of video and audio during capture or playback, and
additionally, providing presentation control functions such as
play, pause, step and reverse. This module is accordingly
responsible for notifying an active application of stream events
during audio and video capture or playback. An event is here
defined as the current presentation unit number, for which an
indication would be made, or the occurrence of the matching of a
prespecified presentation unit number with a current presentation
unit number.
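
The two kinds of stream event just defined, namely an indication
of the current presentation unit number and a match between a
prespecified unit number and the current one, may be pictured as
follows. This Python sketch is an illustration under assumptions:
the class StreamEventNotifier, its callback signature and the
event labels "position" and "match" are hypothetical names and do
not come from the described system.

```python
from typing import Callable, List, Optional, Tuple


class StreamEventNotifier:
    """Hypothetical helper that reports stream events to an application callback."""

    def __init__(self, callback: Callable[[str, int], None],
                 watch_unit: Optional[int] = None) -> None:
        self.callback = callback
        self.watch_unit = watch_unit  # prespecified presentation unit number, if any

    def on_unit_presented(self, unit_number: int) -> None:
        # Event of the first kind: indicate the current presentation unit number.
        self.callback("position", unit_number)
        # Event of the second kind: the current number matches the prespecified one.
        if self.watch_unit is not None and unit_number == self.watch_unit:
            self.callback("match", unit_number)


events: List[Tuple[str, int]] = []
notifier = StreamEventNotifier(lambda kind, n: events.append((kind, n)), watch_unit=2)
for n in range(4):
    notifier.on_unit_presented(n)
```

An application could use the match event, for example, to update
its display when playback reaches a chosen unit.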

During the active playback of audio and video, or other
dynamic media data streams, the stream I/O manager and the
stream interpreter act as the time-based producer and
consumer, respectively, of the data streams being played
back. Conversely, during recording of a dynamic data
stream, the stream interpreter acts as the time-based stream
producer and the stream I/O manager acts as the time-based
stream consumer. During both playback and recording, the
I/O manager and the interpreter operate autonomously and 
asynchronously, and all data in an active stream flows 
directly between them via a well-defined data channel 
protocol. The stream controller asynchronously sends con- 
trol messages to affect the flow of data between the I/O 
manager and the interpreter, but the controller does not itself 
participate in the flow of data. As discussed below, all data 
flow operations are handled using a minimal number of 
buffer copies between, for example, a disk or network 
subsystem and the digital video capture and presentation 
hardware. 
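
The separation between the data path and the control path during
playback may be sketched as below. This is a simplified,
single-threaded Python illustration, not an implementation of the
described agents, which run autonomously and asynchronously; the
class names, the "pause" and "play" messages and the batch size
are assumptions made for the example.

```python
from collections import deque


class IOManager:
    """Producer during playback: places presentation units on the data channel."""

    def __init__(self, source, channel):
        self.source, self.channel, self.paused = iter(source), channel, False

    def control(self, message):      # control path
        self.paused = (message == "pause")

    def cycle(self, batch=2):        # data path: several units per process cycle
        if self.paused:
            return
        for _ in range(batch):
            unit = next(self.source, None)
            if unit is None:
                return
            self.channel.append(unit)


class Interpreter:
    """Consumer during playback: takes one unit per process cycle and presents it."""

    def __init__(self, channel):
        self.channel, self.paused, self.presented = channel, False, []

    def control(self, message):
        self.paused = (message == "pause")

    def cycle(self):
        if not self.paused and self.channel:
            self.presented.append(self.channel.popleft())


class Controller:
    """Sends control messages only; it never touches the data channel."""

    def __init__(self, *agents):
        self.agents = agents

    def send(self, message):
        for agent in self.agents:
            agent.control(message)


channel = deque()
io_manager, interpreter = IOManager(range(6), channel), Interpreter(channel)
controller = Controller(io_manager, interpreter)

io_manager.cycle(); interpreter.cycle()   # normal playback
controller.send("pause")
io_manager.cycle(); interpreter.cycle()   # no data moves while paused
controller.send("play")
io_manager.cycle(); interpreter.cycle()   # playback resumes
```

Note that the controller affects the flow only through messages
to the two agents, while every unit travels directly from the
producer to the consumer.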

This system design is particularly advantageous in that it 
provides for complete transparency with respect to the 
domain of the I/O manager and the interpreter, thereby 
providing the ability to extend the system to a network 
client/server configuration, as explained below. Moreover, 
this basic three-agent unit may be concatenated or recursed 
to form more complex data and control functionality graphs. 

In the architecture of the local DVMS manager, the 
activity of one of the asynchronous agents, each time it is 
scheduled to run while participating in a stream flow, is 
represented as a process cycle. The rate at which an asyn-
chronous agent is periodically scheduled is represented as
the process rate for that agent, and is measured as process
cycles per second. A process period is defined as the time 
period between process cycles. In order to maintain con- 
tinuous data flow of streams between the stream I/O man- 
ager and the stream interpreter, the limiting agent of the two 
must process a process period's worth of presentation units 
within a given process cycle. In cases in which such process 
rates are not achieved, the local DVMS manager can control 
the flow rate, as explained below. The process rate for the 
stream interpreter is close to the nominal presentation rate of 
the stream, i.e., in every process cycle, a presentation unit is 
processed. The stream I/O manager services several presen- 
tation units in every process cycle and thus, its process rate 
may be much lower than the presentation rate. 
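
The relationship between process rate, process period and the
number of presentation units to be handled in each process cycle
can be illustrated numerically. In the following Python sketch
the function names and the example rates (a 30 units/sec
presentation rate and an I/O manager scheduled 5 times per
second) are assumptions chosen for illustration.

```python
import math


def process_period_s(process_rate_hz: float) -> float:
    """Time between process cycles, in seconds."""
    return 1.0 / process_rate_hz


def units_per_cycle(presentation_rate_hz: float, process_rate_hz: float) -> int:
    """Units an agent must handle per cycle: one process period's worth."""
    return math.ceil(presentation_rate_hz / process_rate_hz)


# Interpreter scheduled at the nominal presentation rate: one unit per cycle.
interpreter_units = units_per_cycle(30, 30)

# I/O manager scheduled far less often: several units per cycle.
io_manager_units = units_per_cycle(30, 5)
io_manager_period = process_period_s(5)
```

Under these assumed rates the interpreter handles one unit per
cycle, whereas the I/O manager must service six units in each of
its cycles to keep the stream continuous.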

The modularity of the stream control functions provided
by the stream I/O manager, interpreter and controller makes
the local DVMS manager architecture of the DVMS highly
portable to most modern computer operating systems which
support preemptive multitasking and prioritized scheduling.
This architecture also provides for selective off-loading of
the stream I/O manager and interpreter modules to a
dedicated coprocessor for efficient data management. Most
importantly, the highly decentralized nature of the manager
architecture allows it to be easily adapted to LAN and WAN
systems, as discussed below.

Referring to FIG. 5, when a computer implemented with 
the DVMS of the invention requests access to audio or video 
streams, the following stream flow occurs. The stream I/O 
manager 26 module retrieves the requested streams from a 
stream input 30; this stream input comprises a storage access
point, e.g., a computer file or analog video source. The 
stream I/O manager then separates the retrieved streams 
according to the specified file format of each stream. If two
streams, e.g., audio and video streams, which are accessed were
interleaved in storage, the stream I/O manager dynamically
separates the streams to then transform them to distinct internal
representations, each comprising a descriptor which is defined
based on their type (i.e., audio or video). Once separated, the
audio and video stream data are handled both by the stream I/O
manager and the stream interpreter as distinct constituent
streams within a stream group. The stream I/O manager 26 then
exchanges the stream data, comprising sequences of presentation
units, with the stream interpreter 28 via a separate queue of
presentation units, called a stream pipe 32, for each constituent
stream; an audio stream pipe 33 is thus created for the audio
presentation units, and a video stream pipe 31 is created for the
video presentation units. Each audio stream (of a group of audio
streams) has its own pipe, and each video stream has its own
pipe. During playback of streams, the stream I/O manager
continually retrieves and produces presentation units from
storage and the stream interpreter continuously consumes them,
via the stream pipes, and delivers them to a digital media data
subsystem for, e.g., presentation to a user.
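
The separation of interleaved stored data into one stream pipe
per constituent stream may be sketched as follows. This Python
fragment is a minimal illustration; the function name, the
representation of stored data as (stream name, unit) pairs and
the use of a deque as the pipe are assumptions and do not reflect
any particular file format.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Tuple


def separate_into_pipes(interleaved: Iterable[Tuple[str, bytes]]) -> Dict[str, Deque[bytes]]:
    """Route each presentation unit to the pipe of its own constituent stream."""
    pipes: Dict[str, Deque[bytes]] = {}
    for stream_name, unit in interleaved:
        pipes.setdefault(stream_name, deque()).append(unit)
    return pipes


# Stored form: audio and video units interleaved in one container.
stored = [("video", b"v0"), ("audio", b"a0"), ("video", b"v1"), ("audio", b"a1")]
pipes = separate_into_pipes(stored)

# The consumer takes units from each pipe independently of the other.
first_video = pipes["video"].popleft()
```

After separation, each pipe holds only the units of its own
stream, in their original order.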

When retrieving a plurality of streams from an input 30 in which
the streams are separated (not interleaved), the stream I/O
manager retrieves and queues the streams' data in a round robin
fashion, but does not perform any stream separation function. The
stream interpreter processes these streams in the same manner as
it processes those which are originally interleaved. Thus, the
stream I/O manager advantageously shields the remainder of the
system from the nature of the static container 30, and further
"hides" the format of the storage container, as well as the way
that logically coordinated data streams are aggregated for
storage. Additionally, the details of the stream interpreter
implementation, such as its hardware configuration, are "hidden"
from the I/O subsystem; in fact the only means of communication
between the two agents is via the well-defined stream pipe data
conduits.
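
Round robin queueing of separately stored streams may be pictured
with the short Python sketch below. The function name, the
dictionary of sources and the fixed number of cycles are
assumptions made for the example; the point illustrated is only
that the resulting pipes have the same form as those produced
from interleaved storage.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator


def queue_round_robin(sources: Dict[str, Iterable[bytes]],
                      cycles: int) -> Dict[str, Deque[bytes]]:
    """Take one unit from each separately stored stream in turn, per cycle."""
    iterators: Dict[str, Iterator[bytes]] = {name: iter(src) for name, src in sources.items()}
    pipes: Dict[str, Deque[bytes]] = {name: deque() for name in sources}
    for _ in range(cycles):
        for name, it in iterators.items():
            unit = next(it, None)       # None when this source is exhausted
            if unit is not None:
                pipes[name].append(unit)
    return pipes


separate_sources = {
    "video": [b"v0", b"v1", b"v2"],
    "audio": [b"a0", b"a1", b"a2"],
}
rr_pipes = queue_round_robin(separate_sources, cycles=2)
```

Because the consumer sees only the pipes, it need not know
whether the units were stored interleaved or separately.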

Referring also to FIG. 6, during a presentation scenario, the
stream controller 24 first initializes 36 the stream I/O manager
26 and stream interpreter 28, by creating active modules of them
to begin processing streams, and then defines and indicates 38 a
stream group and the corresponding constituent stream names. The
stream I/O manager 26 then retrieves 40 the named streams from
corresponding storage containers 30 and separates the streams, if
stored in an interleaved fashion. If they were not interleaved,
the streams are retrieved in a round-robin fashion. Once the
streams are retrieved, the stream I/O manager converts 42 the
streams to an internal computer representation of stream tokens,
described below. Via the stream group indication 38, each stream
token is identified with a stream and a stream group by the
indication provided to the stream I/O manager by the stream
controller. The I/O manager then buffers 44 the streams
separately, each in a distinct stream pipe 32 for consumption by
the stream interpreter 28; the stream controller provides control
46 of the stream group as it is enqueued.

Referring also to FIG. 7, the I/O manager stream translation 42
from storage representation to stream token representation is as
follows. Typically, audio and video data are stored in an
interleaved fashion on a disk and so upon retrieval are in an
interleaved disk buffer, as in the Intel® AVSS file format. The
disk buffers 100 consist of a sequence of stream group frames
105, each frame containing a header 106, a video frame 108, and
an audio frame 110. A separate index table (not shown) containing
the starting addresses of
these stream group frames is maintained at the end of a file
containing these frames. This index table permits random access
to specifically identified stream group frames.

The disk buffers are retrieved by the I/O manager from the disk
in large chunks of data, the size of each retrieved chunk being
optimized to the disk track size, e.g., 64K bytes each. The I/O
manager examines each retrieved stream group frame header and
calculates the starting addresses of each audio and video frame
within the stream group frame. It also retrieves the time stamp
information from the corresponding frames. A linked list of
descriptors, called tokens 112, is then generated for the audio
and video frames; each token represents an audio or video
presentation unit 114 and the time stamp 116 for that unit. These
tokens are continuously linked into a list representing the
stream pipe. Thus, in the process described above, the stream I/O
manager retrieves interleaved data from a disk, separates the
data into distinct streams, and constructs an internal
representation of separated streams based on separate stream
pipes, one for each stream.

Once the streams are enqueued in the stream pipes, the stream
interpreter 28, having been initialized 36 by the stream
controller 24, accepts and dequeues 48 the constituent stream
tokens of presentation units. The debuffered streams are then
scaled 50 and synchronized 52, based on control via the stream
controller, which maintains 54 the status of the stream group.
The scaling process will be described in detail below. The
synchronized streams are then delivered to the digital
presentation subsystem hardware.

The decompression scheme is based on the particular compression
format of video frames, e.g., the motion JPEG video format. This
format is one of a preferred class of video formats, in which
each frame is intracoded, i.e., coded independently, without
specification of other frames.

Referring to FIG. 8, the digital video system 120 receives
streams from the stream interpreter and first decodes and
decompresses 122 the stream data, each stream being processed
separately. The decoded and decompressed data streams are then
stored 124 in corresponding frame buffers, e.g., video and audio
frame buffers. At the appropriate time, the stored data is
converted 126 from its digital representation to a corresponding
analog representation, and is delivered to a playback monitor and
audio speakers. The various operations of the digital hardware
subsystem are controlled by the stream interpreter via digital
video primitives, as explained and described previously.

In the reverse operation, i.e., capture and storage of digital
video and audio streams being processed by a computer system, the
stream interpreter 28 captures the audio and video streams from
the digital hardware subsystem 120. Before this capture, the
hardware subsystem digitizes 128 the audio and video signals,
stores 130 the digitized signals in a buffer, and before passing
the digitized streams to the stream interpreter, compresses and
encodes 132 the video and audio data.

Based on the stream group control provided by the local stream
controller, the stream interpreter generates 62 time stamps for
the captured streams and using the time stamps, creates 64
corresponding stream tokens of video and audio presentation units
with embedded time stamps. The stream tokens are then enqueued 66
to stream pipes 32 for consumption by the stream I/O manager 26.

The piped streams are accepted and dequeued 72 by the stream I/O
manager 26, and then scaled. If the streams are to be stored in
interleaved form, they are then interleaved 76, in a process
which reverses the functionality depicted in FIG. 7. The streams
are not required, of course, to be stored in such an interleaved
form. Once the streams are interleaved, if necessary, the streams
are stored in a corresponding storage container 30.

Each of the functions of the stream controller, stream I/O
manager, and stream interpreter described in these scenarios may
be implemented in hardware or software, using standard design
techniques, as will be recognized by those skilled in the art.
Appendices A, B, and C present a pseudocode scheme for the
interactions between the stream controller, stream I/O manager,
and stream interpreter in retrieving and presenting streams. The
coding of the pseudocode process steps into computer instructions
suitable to carry out the described scenario will be
understandable to one having ordinary skill in the art of C
programming.

Synchronization of Audio and Video

As mentioned in the presentation process described above, the
digital video management system of the invention provides
synchronization of audio to video, and in general,
synchronization between any two or more dynamic streams being
presented. This synchronization function is inherently required
for the coordinated presentation of multiple real-time,
continuous, high data-rate streams in a stream group. For
example, the real-time nature of audio and video is derived from
the presentation attributes of these dynamic data types, which
have quite different presentation attributes; full motion video
needs to be presented at 30 frames per second and high quality
audio needs to be presented at 32,000 samples per second.

Furthermore, digital video and audio data streams have real-time
constraints with respect to their presentation. The streams are
usually continuous and last from 30 seconds long (clips) to 2
hours long (movies). Additionally, the streams typically consume
from about 1 Mbit/sec to 4 Mbit/sec of storage capacity and
transmission bandwidth, depending on the particular compression
technology used for digitizing the stream. Thus, synchronization
of differing data streams must accommodate the diverse temporal
aspects of the streams to be synchronized.

The synchronization capability of the digital video management
system of the invention is based on self-timing, and accordingly,
self-synchronization, of data streams to be synchronized. This
technique accommodates independent handling of multiple data
streams which are together constituent streams of a stream group,
even if the stored representations of the constituent streams are
interleaved; the stream I/O manager separates interleaved streams
before the stream interpreter synchronizes the streams.
Alternatively, independent constituent streams may be stored in
separate file containers and be synchronized, before
presentation, with a common reference time base.

Self-synchronization also provides the ability to prioritize one
constituent stream over other streams in a stream group. For
example, an audio stream may be prioritized over a video stream,
thereby providing for scalable video storage, distribution and
presentation rates, as discussed below. This feature is
particularly advantageous because human perception of audio is
much more sensitive than that of video. For accurate human
perception of audio, audio samples must be presented at a smooth
and continuous rate. However, human visual perception is highly
tolerant of video quality and frame rate variation; in fact,
motion can be perceived even despite a wide variation in video
quality and frame rate. Empirical evidence shows that humans can
perceive motion if the presentation rate is between 15 and 30
frames/sec. At lower frame rates motion is still perceivable, but
artifacts of previous motions are noticeable.
The DVMS of the invention exploits this phenomenon to optimally
utilize available computing, compression and network resources;
by prioritizing the retrieval, transmission, decompression and
presentation of audio over video within a computer system or
network computing environment, and by relying on audio-to-video
synchronization before presentation, rather than at storage, an
acceptable audio rate can be maintained while at the same time
varying the video rate to accommodate resource availability in
the system or network. Additionally, independent management of
audio and video data streams provides many editing capabilities,
e.g., the ability to dynamically dub a video stream with multiple
audio language streams. Similarly, the synchronized presentation
of an audio stream with still pictures is provided for by the
independent stream management technique. It must be remembered
that all of the synchronization schemes described are applicable
to any type of stream, not just audio and video streams.

As described above with reference to FIG. 6, the synchronization
of streams within a stream group is the responsibility of the
stream interpreter module during a scaling process. The streams
may be self-synchronized using either an implicit timing scheme
or an explicit timing scheme. Implicit timing is based on the
fixed periodicity of the presentation units in the constituent
streams of a stream group to be synchronized. In this scheme,
each presentation unit is assumed to be of a fixed duration and
the presentation time corresponding to each presentation unit is
derived relative to a reference presentation starting time. This
reference starting time must be common to all of the constituent
streams. Explicit timing is based on embedding of presentation
time stamps and optionally, presentation duration stamps, within
each of the constituent streams themselves and retrieving the
stamps during translation of streams from the storage format to
the token format. The embedded time stamps are then used
explicitly for synchronization of the streams relative to a
chosen reference time base.

Using either the implicit or explicit timing self-synchronization
schemes, a reference time base is obtained from a reference
clock, which advances at a rate termed the reference clock rate.
This rate is determined by the reference clock period, which is
the granularity of the reference clock ticks.

The DVMS of the invention supports two levels of
self-synchronization control, namely, a base level and a flow
control level. Base level synchronization is applicable to stream
process scenarios in which the stream I/O manager is able to
continuously feed stream data to the stream interpreter, without
interruption, and in which each presentation unit is available
before it is to be consumed. In this scenario, then, the stream
I/O manager maintains a process rate and a process work load that
guarantees that the stream I/O manager stays ahead of the stream
interpreter.

The flow control level of synchronization is a modification of
the base level scheme that provides a recovery mechanism from
instantaneous occurrences of computational and I/O resource
fluctuations which may result in the stream pipe between the
stream I/O manager and the stream interpreter running dry. This
could occur, for example, in a time-shared or multi-tasked
computer environment, in which the stream I/O manager may
occasionally fall behind the stream interpreter's demand for
presentation units due to a contention, such as a resource or
processor contention, with other tasks or with the stream
interpreter itself. In such a scenario, the DVMS of the invention
augments the base level of synchronization with a stream flow
control function, as described below.

Base Level Implicit Timing Synchronization

As explained above, the base level synchronization scheme assumes
that there is no need for control of stream flow to the stream
interpreter, and thus does not monitor for vacancy of the stream
pipe. Implicit timing is based on a reference time base that is
applied to each stream to be synchronized.

Considering a scenario in which audio and video streams are to be
synchronized, each presentation unit for the video stream to be
presented might typically contain video information to be
presented in a frame time of, e.g., 33 msec, for NTSC video play.
The audio stream might typically be divided into fixed frames of
presentation time with marginally varying samples per
presentation unit. In a storage scheme in which the audio and
video are interleaved, these fixed units of time are set as the
time duration for a video frame, i.e., 33 msec.

In this synchronization scenario, the stream interpreter
maintains a separate presentation unit counter for each stream
pipe, and correspondingly, for each stream in the stream group.
The interpreter consumes presentation units from the two streams
in a round robin fashion, i.e., first one, then the other, and so
on. Importantly, an independent presentation synchronization
decision is made for each presentation unit, or token, of each
stream, based on a corresponding reference time base, without
regard to other streams. This reference time base indicates the
current real time relative to the start time of the presentation
unit consumption process for the corresponding stream. The stream
counter of each stream pipe indicates the number of already
consumed presentation units in the corresponding stream.
Multiplying this count by the (fixed) duration of each of the
presentation units specifies the real time which has elapsed to
present the counted units. When this real time product matches
the current reference time, the next presentation unit is
released for presentation.

The stream interpreter initiates the consumption and presentation
of each presentation unit in sequence during its presentation
process cycle based on the presentation decision scheme given in
pseudocode in Appendix D. This scheme implicitly assumes that the
stream interpreter is scheduled such that the interpreter process
rate is very close to the nominal presentation rate of the
corresponding stream. This scheme is based on a comparison of a
reference time base with the amount of time required to present
the number of already-consumed presentation units, and thus
requires the use of counters to keep a count of presentation
units as they are consumed.

Base Level Explicit Timing Synchronization

As explained previously, in the explicit timing scheme, stream
synchronization is based on time stamps that are embedded in the
corresponding streams' tokens themselves. The time stamps
represent the time, relative to the reference time base, at which
the corresponding audio or video presentation frames are to be
consumed and presented. The time base may be, for example, an
external clock, or may be generated from the embedded time base
of one of the streams to be synchronized. The periodicity of the
time stamps is itself flexible and can be varied depending on
particular synchronization requirements. Time stamps may be
embedded in the streams during capture and compression
operations, as described above, or at a later time during, for
example, an editing process. Independent of the process by which
the time stamps are embedded in a stream, the stamps are utilized
by the stream I/O manager and interpreter during playback
processes to make the consumption and presentation decisions. The
stream interpreter does not maintain a
presentation unit counter in this scheme, as it does in the implicit timing scheme. Rather, the embedded time stamps in the streams provide equivalent information.

A time stamp for a presentation frame token consists of two 32-bit integers representing the presentation time and the presentation duration for that presentation unit. The presentation time and the presentation duration are represented in milliseconds. The presentation duration may be omitted if all presentation units are of the same duration.

In this synchronization scheme, the interpreter reads the embedded time stamp of each presentation token, as that token is processed, to determine presentation time and duration for each presentation unit in the sequence. The interpreter decides on consumption and presentation of each presentation unit in each stream based on the decision scheme given in pseudocode in Appendix E. This decision scheme is based on the assumption that the stream interpreter is scheduled such that its process rate is very close to the nominal presentation rate of the corresponding stream.

This scheme is based on a comparison of a reference time base with the presentation time and presentation duration stamp embedded in each presentation unit. When a presentation unit's stamp presentation time corresponds to the reference time, that presentation unit is consumed for presentation.

In addition to determining the appropriate time for releasing presentation units in the sequence, both the implicit and explicit timing schemes delete presentation units if the appropriate release time for those units has passed. For example, in the implicit timing scheme, when the product of processed units and unit duration exceeds the currently maintained time count, the next sequential unit is deleted, rather than presented. Similarly, in the explicit timing scheme, when the current presentation time exceeds the time stamp presentation time of a presentation unit, that unit is deleted, rather than presented. In this way, synchronization of streams is maintained, even if units arrive for presentation at a later time than expected. Appendices D and E give corresponding pseudocode for this presentation unit deletion function.

Flow Control Level Implicit Timing Synchronization

The flow control synchronization scheme augments the base level synchronization scheme to provide for recovery from instantaneous computational and I/O resource fluctuations during a consume and presentation process cycle. The base level scheme relied on the assumption that the stream I/O manager stays ahead of the stream interpreter to keep stream pipes from becoming vacant, or running dry. Flow control synchronization guards against this condition using a scheme based on virtual presentation units.

A virtual presentation unit is one which allows the underlying digital hardware subsystem to continue with a default presentation for the duration of a corresponding presentation unit, while at the same time maintaining a consistent internal state, to thereby provide sequential processing of a stream that is being presented, even while the stream pipe is temporarily empty. Virtual presentation units may be implemented in a variety of embodiments. For example, in the case of motion JPEG video, the playing of a virtual presentation unit would preferably correspond to redisplaying the most recent previous video frame. In the case of audio streams, a virtual presentation unit would preferably correspond to a null unit, i.e., a presentation unit consisting of null samples that represent silence. Other virtual presentation unit implementations are equally applicable.

During a presentation process cycle using the flow control implicit timing scheme to synchronize stream flow, the stream I/O manager and stream interpreter perform the same operations described above in the base level scheme. As explained, the interpreter maintains a separate presentation unit counter for each stream within the stream group being presented, to keep track of the number of already-consumed presentation units in each stream. Multiplying this count by the duration of each presentation unit specifies the time at which, when matching the reference time, the next presentation unit in the sequence is to be presented. The stream interpreter decides on the consumption and presentation of each presentation unit based on the decision scheme given in pseudocode in Appendix F, which assumes that the interpreter is scheduled at a process rate that is close to the nominal stream presentation rate. In this scheme, when the interpreter finds that a presentation token is not available from the stream pipe, and that the reference time and presentation unit count indicate that a presentation unit is needed, a virtual presentation unit is generated and consumed for presentation.

Flow Control Level Explicit Timing Synchronization

During a presentation process cycle using the explicit timing synchronization mechanism augmented with flow control capability, each presentation token in the stream group being presented is assumed to include its own embedded time stamp for presentation time and duration. As in the explicit timing scheme without flow control, the stream interpreter examines each embedded time stamp to decide on the consumption policy of the corresponding presentation unit in the stream pipes set up by the stream I/O manager. The consumption policy is determined based on the decision scheme, given in pseudocode in Appendix G, which assumes, as did the other schemes, that the process rate of the stream interpreter is close to the nominal presentation rate of the corresponding stream. In this scheme, when it is determined that another presentation unit is not available from the stream pipe and a unit should be presented, a virtual presentation unit is generated based on a default presentation duration, and that unit is then consumed for presentation.

Additionally, in the flow control schemes of either implicit or explicit timing, [...] presentation units. This capability is invoked whenever a previously unavailable presentation unit later becomes available. In the explicit timing scheme, the time stamp of a later available unit will never match the reference time after the presentation of a virtual presentation unit and thus that unit will never be presented, and will be discarded. In the implicit timing scheme, the presentation of a virtual presentation unit in place of an unavailable presentation unit advances the presentation unit counter, as does any presented unit. When the unavailable unit is then later available, the presentation unit count will be advanced such that the product of the count and the fixed presentation unit duration will not permit presentation of that unit.

Coding of the four synchronization processes described above and in Appendices D-G into instructions suitable for implementing the synchronization techniques will be understandable to those having ordinary skill in the art of C programming.

Self-Synchronization Features

The four self-synchronization schemes described above provide several critical advantages in the digital video management scheme of the invention. Self-synchronization accommodates the ability to dynamically associate distinctly stored streams with a common stream group. Thus, for example, audio and video streams may be stored in separate file containers and grouped dynamically during retrieval from storage for synchronized presentation. As discussed
above, this synchronization of constituent audio and video streams provides, for example, for the function of dubbing of video with audio, and synchronizing still video with audio. Additionally, using the stream synchronization technique, stream segments from different file containers can be dynamically concatenated into one stream. In the case of explicit self-synchronization, the stream I/O manager marks the first presentation unit in a stream segment with a marker indicating the start of a new stream segment. Then, when the stream interpreter [...], the interpreter reinitializes the reference time base for the corresponding stream.

Self-synchronization further accommodates adaptation to skews in the clock rates of audio and video hardware used to play audio and video streams which are being synchronized. For example, an audio stream recorded at an 11, 22 or 33 KHz sampling rate must be played back at exactly that sampling rate for accurate audio reproduction. Similarly, a video stream recorded at 30 frames per second must be played back at that same rate. The audio and video hardware playing these streams thus must each use clocks adapted for the particular play rate requirement of the corresponding stream. Any skew in the clock rates would cause drifting of the playing streams, and thus destroy synchronization of the streams, if the skew were to be uncorrected. Self-synchronization achieves this correction automatically using a reference time base which the audio and video time bases are checked against; the consumption rate of a stream is adjusted to drop presentation units periodically, if necessary, if a skew in one of the time bases, relative to its prescribed correspondence with the reference time base, is detected, thereby maintaining synchronization with respect to the reference time base and the other stream.

The self-synchronization schemes provide the capability to vary the inherent presentation rate of streams. For example, a video stream captured in PAL format, based on 25 frames per second, may be played in the NTSC format, which is 30 frames per second, albeit with some loss of fidelity. In general, any stream may be played at a custom rate, independent of the rate at which the stream was captured. In fact, it is often desirable in video playback to either speed up or slow down the nominal presentation rate of the video. Using the self-synchronization technique, the video presentation rate may be, for example, sped up by a factor of 2 by simply advancing the reference time base to twice the real time rate. Conversely, the presentation may be slowed by half by advancing the reference time base at one half the real time rate. In these cases, the total time elapsed for the presentation will be, of course, one half or twice the elapsed time for the presentation made at the nominal rate.

Stream Scalability

A scalable stream is a stream that can be played at an aggregate nominal presentation rate with variable data rates, under computer control. Of course, variation in the data rate may affect the quality, fidelity or presentation rate of the stream. The coupling of stream scalability with stream self-synchronization provides a powerful control mechanism for flexible presentation of audio and video stream groups.

As discussed above, scalability allows the DVMS to optimize utility of computer system resources by adjusting stream rates according to utility availability. In the case of audio and video streams, the stream interpreter may be programmed to give higher priority to audio streams than video streams, and thus consume audio presentation units at the nominal audio presentation rate, but consume video units at an available presentation rate. This available presentation rate is determined by the available computational resources of a given computer system. Different computer systems having varying performance characteristics require differing amounts of time to accomplish presentation operations. Such operations involve decompression, format conversion and output device mapping. In particular, a compressed [...] stream has to be Huffman decoded, decompressed, converted to RGB color space, and mapped to a 256 color VGA palette by the digital hardware subsystem before presentation within an IBM [...] system; different computer systems require various time periods to accomplish these tasks. The management system of the invention adapts to any computer performance characteristics by adjusting the scale of the stream flow rate to accommodate the availability of utilities in that computer.

Most importantly, the stream scalability feature of the digital video management system of the invention provides the ability to comprehensively manage distribution of digital streams over packet networks. The DVMS exploits this capability in a network embodiment providing management protocol schemes for client-server sessions, as well as management protocol schemes for storing, accessing, retrieving and presenting streams over a LAN or WAN. The system thereby accommodates on-demand retrieval and playback of stored streams, and injection and tapping of multicast live streams over packet networks. The managed digital streams may be stored in ordinary computer files on file servers, or may be generated from live analog sources and made accessible over a LAN or WAN. Such access may be on-demand, as mentioned above, as in retrieval and presentation from a stored file, or on-schedule, as in injection and tapping from a broadcast channel. The management protocol schemes provided by the DVMS will be fully described below.

Referring now to FIG. 9, in a network implementation, the local DVMS manager 20 accesses digital media streams located elsewhere in the network via the remote DVMS manager 82 of the management system; the local DVMS manager provides a client operating environment, while the remote DVMS manager provides a network operating environment. Via the network 80, the local DVMS manager 20 and the remote DVMS manager 82 transmit control messages and digital media data streams as they are requested by a computer client connected in the network.

Remote DVMS Manager

The remote DVMS manager 82 manages network control of digital media streams via four independent modules, namely, a remote stream controller 84, a remote stream input/output (I/O) manager 86, a remote network stream I/O manager 88, and a local network stream I/O manager 90.

In this DVMS network implementation, the local DVMS manager 20, residing locally to a client computer in the network, comprises a local stream controller 24, local stream I/O manager 26 and local stream interpreter 28. The local network stream I/O manager 90 of the remote DVMS manager directly interfaces with the local DVMS manager locally.

The remote stream controller 84 resides on a remote storage device or access point, e.g., a video server, in the network. This controller is responsible for managing the remotely stored streams, e.g., video files, and thereby making them available for on-demand access by the local stream controller module of the local DVMS manager. Client-server session management protocols control this access. The remote stream controller also provides a link for feedback control from the local DVMS manager to the remote DVMS manager, as described below.
The remote stream I/O manager 86 also resides on a remote server; it is responsible for dynamically retrieving and storing streams from or to a storage container in the remote storage server. Efficient access to stored stream information and handling of file formats is provided by this module. Thus, the remote stream I/O manager performs the same tasks as those performed by the stream I/O manager of the local DVMS manager in a stand-alone computer implementation, tasks including translation between stored stream representations and corresponding dynamic computer-based token representations.

The remote network stream I/O manager 88, implemented on a remote server, regulates transmission of streams across the network to and from a local DVMS manager with which a communications session has been initiated. This transmission comprises stream exchange between the remote network stream I/O manager 88 and the local network stream I/O manager 90, which resides locally with respect to the local DVMS manager modules, on a client in the network. Stream transport protocols control the transmissions. The local network stream I/O manager 90 receives streams from the network and delivers them to the local DVMS stream interpreter 28 during playback processes; conversely, it receives streams from the local stream interpreter and transmits them over the network during recording and storage processes.

The DVMS of the invention provides protocols for managing the interaction and initialization of the local DVMS manager modules and the remote DVMS manager modules just described. Specifically, four classes of protocols are provided, namely, access protocols, for stream group naming and access from a stream server or injector; transport protocols, providing for stream read-ahead, and separation and prioritization of streams; injection/tap protocols, providing the capability to broadcast scheduled streams, e.g., video streams, to selected network clients; and feedback protocols, accommodating the management of adaptive computational resources and communication bandwidths.

When the DVMS is configured in a network environment, remote media data stream file servers in the network advertise the stream groups controlled in their domain based on a standard network advertisement protocol. For example, in the Novell® Netware™ environment, servers advertise based on the Service Advertisement Protocol (SAP). Each video server is responsible for a name space of stream group containers that it advertises.

As shown in FIG. 9, when an application running on a computer (client) connected in the network opens a stream group container by name to access the container contents, the DVMS initializes the corresponding local stream controller 24 of the local DVMS manager to access the corresponding stream group. The local stream controller then sets up a client-server session with the appropriate remote stream controller 84 based on the stream group container name that the application wishes to access and the remote server's advertisement. The local stream controller may access multiple stream group containers during a single session. This capability results from the name service architecture employed by the remote DVMS manager. In this scheme, a domain of container names is accessed via a single access call, whereby multiple containers in the domain are simultaneously available for access.

The local stream controller 24 then initializes the local network stream I/O manager 90 of the remote DVMS manager, and commences a stream read-ahead operation, described below, with the appropriate remote stream controller 84. In turn, that remote stream controller initializes the corresponding remote stream I/O manager 86 and remote network stream I/O manager 88 to handle retrieval and transmission of the constituent streams within the accessed stream group.

The stream read-ahead operation is employed to reduce latency perceived by a client when a stream group presentation is begun; stream retrieval, transmission, and scaling require a finite amount of time and would be perceived by a client as a delay. In the read-ahead operation, the remote stream I/O manager, the remote network stream I/O manager, and the local network stream I/O manager retrieve, transmit and scale the streams at the very start of a client-server session, even before the client requests stream presentation. In this scheme, the streams are ready for immediate consumption by the local stream interpreter, via the stream pipes, whenever a user specifies the start of presentation, and possible presentation delays are thereby eliminated or minimized.

Referring now to FIG. 10, when a network client requests access to a specified stream group, the following procedure is implemented. Upon initialization from the request, and based on the network servers' stream group advertisements, the appropriate remote stream I/O manager 86 retrieves stored streams, e.g., audio and video streams, from the appropriate file storage 30 containing the requested stream group. The manager then separates the retrieved streams, if necessary, thereby producing separate audio and video presentation unit streams, and enqueues corresponding stream descriptor tokens in separate stream pipes 87, one pipe for each presentation unit token stream.

The remote network stream I/O manager 88 consumes the presentation unit tokens from each of the stream pipes, assembles transmission packets based on the streams, and releases them for transmission across the network 80 directly to the corresponding local network stream I/O manager 90, based on the DVMS stream data transport protocols; the particular transport protocol used is set by the network environment. For example, in a Novell® network, the Netware SPX protocol is used for stream data transport. The local network stream I/O manager 90, upon receipt of the transmitted presentation units, queues the presentation units in separate stream pipes 32 for each stream to be consumed by the local stream interpreter 28 for use by the client computer's digital media hardware subsystem 34.

Referring to FIG. 11A, illustrating the remote DVMS functions in more detail, upon initialization, the remote stream controller 84 initializes the remote stream I/O manager 86 and the remote network stream I/O manager 88 by creating 130, 136 active modules of each of the managers. It also specifies 132 the requested stream group for access by the two managers. Control 134 of the specified stream group is provided throughout the duration of the managers' functions.

The remote stream controller 84 also provides management 138 of the client/server session which proceeds between the local and remote DVMS systems as a result of the stream group request. Based on information provided by the local DVMS manager which requested the stream group, the remote stream controller receives 140 a desired rate value from the local DVMS; this rate value indicates the rate at which the streams are to be presented, and is explained more fully below. The remote stream controller specifies 142 this rate to each of the remote stream I/O manager 86 and the remote network stream I/O manager 88, which each receive 144 the rate.

The remote stream I/O manager 86 retrieves, separates, and scales 146 audio and video streams from the appropriate
stream container 30. If the streams were stored separately, rather than interleaved, the streams may be individually scaled at this point, while if the streams were interleaved, the remote network stream I/O manager 88 later scales the streams, as explained in detail below.

In a process explained previously with reference to FIG. 7, the remote stream I/O manager creates 148 stream tokens corresponding to the stream presentation unit frames retrieved from storage, and enqueues 150 the stream tokens for delivery to the remote network stream I/O manager via individual stream pipes 32.

The remote network stream I/O manager 88 dequeues 152 the tokens from the stream pipes and, if necessary, scales 154 the tokens. The tokens are then formatted 156 for transmission packets, and released to the network for transmission.

Referring also to FIG. 12, the packet format process 156 is implemented as follows. Each token 114 in the token streams 112 is enqueued in a buffer 118, whereby each buffer contains tokens and associated media frame data from one stream only, even if the streams were originally interleaved in storage. Tokens, along with corresponding media data from the buffers, are then sequentially ordered in packets 120 in such a manner that each token and the corresponding media data remain associated. This association, along with the fact that tokens are likely to be time stamped, does not require that the storage format and congruency of the stream be preserved in the transmission packets during transmission.

This packet format scheme provides dramatic advantages over the conventional packet format scheme of the prior art. In the conventional packet protocol, the stored media data format, which is typically interleaved, is preserved in the transmission packet format. Thus, in this scheme, audio and video streams are transmitted across a network in packets containing a sequence of interleaved headers, audio frames, and video frames, and thus, the specific syntax by which the interleaved streams were stored is replicated in the packet format.

In contrast, in the packet format scheme of the invention, the separation of streams and distinct formatting of packets for each stream provides an opportunity and the facility to examine, process, and make transmission decisions about each stream and each presentation unit independent of other streams and presentation units. As a result, the local DVMS manager can make presentation decisions about a given presentation unit token independent of the other tokens in the corresponding stream, and can make those decisions "on-the-fly". This capability provides for real time scaling and network load adjustment as a stream is retrieved, processed, and transmitted across the network. The conventional prior art scheme does not have any analogous facility, and thus cannot provide the synchronization, scaling, and rate control features of the invention.

Referring to FIG. 11B, once the stream group is transmitted across the network, the local DVMS manager processes the stream group for presentation. The local stream controller 24 manages 158 the client/server session communication with the remote stream controller 84. Like the remote stream controller, it also creates 160, 162 instances of active processors, here initializing the local network stream I/O manager 90 and the local stream interpreter 28. The local stream controller creates 164 the stream grouping of interest and controls 166 that group as the local network stream I/O manager 90 and stream interpreter 28 process the group.

The local network stream I/O manager 90 receives 168 the transmitted network packets and assembles presentation units as they are received. Then it creates 170 stream tokens from the received packets and enqueues 172 them to individual stream pipes. The stream interpreter 28 dequeues 176 the tokens from the stream pipes and scales 176 the tokens as required, in a process discussed below. Then, using the synchronization schemes explained previously, the streams are synchronized 178 and sent to the digital hardware subsystem for presentation. The functions of this hardware were explained previously with reference to FIG. 8.

In the reverse process, i.e., when recording streams from a network client for storage on a remote stream server, as shown in FIGS. 11A and 11B, the digital stream hardware subsystem provides to the local stream interpreter 28 the stream data, and based on the playing format of the streams, the local stream interpreter generates 180 corresponding time stamps, for use in synchronization and scaling. Stream tokens are then created 182 and enqueued 184 in the stream pipes.

The local network stream I/O manager dequeues 186 the stream tokens from the pipes and scales 188 the streams based on their play rate, record rate, and storage format, as discussed below. Then packets are formed and transmitted 190 via the network to the remote server location on which the corresponding remote DVMS exists.

Thereafter, the remote network stream I/O manager 88 receives 192 the transmitted packets and creates 194 stream tokens based on the packets. The tokens are then enqueued 196 in stream pipes for consumption by the remote stream I/O manager. The remote stream I/O manager dequeues 198 the tokens from the stream pipes, and scales 200 the streams if necessary. Finally, it interleaves the streams, if they are to be stored in an interleaved format, and stores 202 the streams in appropriate stream containers on the server.

FIGS. 11A and 11B illustrate that the network implementation of the DVMS of the invention is an elegant and efficient extension of the stand-alone DVMS implementation; this extension is possible as a result of the modularity in design of each processing entity. Specifically, the details of packet transport are transparent to the remote stream I/O manager; it functions in the same manner as a stand-alone stream I/O manager. Similarly, presentation unit token streams provided to the local stream interpreter do not contain transmission-specific formats.

As a result, the local DVMS manager, when implemented in a network environment, is easily reconfigured to provide a remote DVMS manager which includes a corresponding remote stream I/O manager, with the addition of a remote network stream I/O manager, and a local DVMS manager which includes a corresponding local stream interpreter, and a local network stream I/O manager from the remote DVMS manager. Exploiting this modularity, programming applications may be created which are supported by the DVMS functionality without them perceiving a functional difference between a local, stand-alone type stream scenario and a remote, network stream scenario.

Appendices H, I, J, and K together present a C-language pseudocode implementation of the client-server session control and remote and local stream processing techniques required in addition to those given in Appendices A, B, and C for the network implementation of the DVMS of the invention. Those having ordinary skill in the art of C programming will understand the coding of these pseudocode processes into corresponding code.

Additionally, as will be recognized by those skilled in the art, these processes may alternatively be implemented in hardware using standard design techniques to provide the identical functionality.
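The per-stream packet format process described with reference to FIG. 12 can be illustrated with a minimal C sketch. It is not the pseudocode of the Appendices. The `stream_token` structure, the 12-byte token header, the host byte order of the header fields, and the function name `pack_stream_packet` are all assumptions made for illustration only. The sketch shows a packet being filled with tokens from one stream only, each token header being followed directly by its own media data so that the two remain associated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical token describing one presentation unit of one stream. */
typedef struct {
    uint8_t        stream_id;    /* stream the unit belongs to */
    uint32_t       time_ms;      /* presentation time stamp, milliseconds */
    uint32_t       duration_ms;  /* presentation duration, milliseconds */
    uint32_t       data_len;     /* length of the media frame data in bytes */
    const uint8_t *data;         /* media frame data */
} stream_token;

#define TOKEN_HEADER_BYTES 12u   /* assumed layout: time, duration, length */

/* Copies the tokens of a single stream, in order, into `packet`.
 * Tokens of other streams are skipped, so the packet never mixes streams,
 * even if the token array reflects an interleaved storage order.
 * Stops at the first token that no longer fits. Returns bytes written. */
size_t pack_stream_packet(const stream_token *tokens, size_t count,
                          uint8_t stream_id,
                          uint8_t *packet, size_t capacity)
{
    size_t used = 0;

    for (size_t i = 0; i < count; i++) {
        const stream_token *t = &tokens[i];
        if (t->stream_id != stream_id)
            continue;                      /* goes into another stream's packet */

        size_t need = (size_t)TOKEN_HEADER_BYTES + t->data_len;
        if (need > capacity - used)
            break;                         /* packet full */

        /* Header first, then the media data it describes. */
        memcpy(packet + used,     &t->time_ms,     4);
        memcpy(packet + used + 4, &t->duration_ms, 4);
        memcpy(packet + used + 8, &t->data_len,    4);
        if (t->data_len > 0)
            memcpy(packet + used + TOKEN_HEADER_BYTES, t->data, t->data_len);
        used += need;
    }
    return used;
}
```

Calling the function once per stream identifier yields one packet per stream, which is what allows a receiver to make decisions about one stream without parsing the other.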
Scalable Stream Rate Control

In the network embodiment of the DVMS of the 
invention, the remote and local DVMS managers operate 
together to provide control of the rate of flow of streams 
through a network during stream transmission. As men- 
tioned above, this capability is particularly advantageous in 
handling audio and video streams to accommodate fluctua- 
tions in network utility availability by prioritizing audio 
stream rate over video stream rate. 

This priority is based on the premise that human visual 
perception of motion is highly tolerant of variations in the 
displayed quality and frame rate of presented video. 
Typically, humans perceive motion when a video presenta-
tion rate is at least 15 frames per second. Moreover,
instantaneous and smooth variations in video presentation
rates are practically unnoticeable. However, human aural
perception is quite intolerant of variations in audio presen- 
tation quality or rate. Typically, humans perceive noise when 
a constant audio presentation rate is not maintained, and 
perceive "clicks" when brief periods of silence are injected
into an audio stream. Thus, the DVMS system prioritizes 
audio streams over video streams. This prioritization of 
audio over video extends over the entire data flow of audio 
and video streams in a network, starting from their retrieval 
from storage containers and ending with their presentation. 
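One way to picture this prioritization is a rough C sketch in which audio is always served at its nominal data rate and video receives whatever share of its nominal rate still fits. The function name `video_rate_percent`, the use of a single bandwidth figure in bytes per second, and the proportional formula are illustrative assumptions, not the method of the Appendices.

```c
#include <stdint.h>

/* Hypothetical helper: percentage (0..100) of the nominal video rate that
 * fits into the bandwidth remaining after audio is served in full. */
unsigned video_rate_percent(uint32_t available_bps,
                            uint32_t audio_nominal_bps,
                            uint32_t video_nominal_bps)
{
    if (video_nominal_bps == 0)
        return 100;                    /* no video stream to throttle */
    if (available_bps <= audio_nominal_bps)
        return 0;                      /* audio alone uses the whole budget */

    uint32_t left_for_video = available_bps - audio_nominal_bps;
    if (left_for_video >= video_nominal_bps)
        return 100;                    /* both streams fit at nominal rate */

    /* 64-bit intermediate avoids overflow in the multiplication. */
    return (unsigned)(((uint64_t)left_for_video * 100u) / video_nominal_bps);
}
```

For instance, with 1000 bytes per second available, audio needing 200 and video nominally needing 1600, the sketch keeps audio at its nominal rate and scales video to 50 percent.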

Control of the rate of streams through a network based on
this audio prioritization scheme may be initiated
automatically, or in response to a direct user request. Each
type of control request is discussed below in turn. The
remote DVMS manager responds to each type in the same
manner, however.

Referring again to FIG. 11A, remote stream controllers 84
in the network are responsible for instructing the
corresponding remote stream I/O manager 86 and remote network
stream I/O manager 88 as to what percentage of the nominal
presentation rate (at which the stream would "normally" be
presented) the stream should be actually retrieved and
transmitted. The remote stream controller receives 140 the
desired rate value via network communication with the local
stream controller 24 and specifies 142 this rate to the remote
stream I/O manager 86 and the remote network stream I/O
manager 88, which each receive 144 the rate value.

The stream rate control mechanism is carried out by either
the remote stream I/O manager or the remote network
stream I/O manager, depending on particular stream access
scenarios. As explained above, if the requested audio and
video streams are interleaved in storage, in, e.g., the Intel
DVI AVSS file format, the remote stream I/O manager
retrieves the streams in that interleaved form, separates the
streams into distinct streams, and creates corresponding
presentation unit tokens. The remote stream I/O manager
does not, in this scenario, have the ability to manipulate the
streams distinctly because they are retrieved interleaved. In
this case, the remote network stream I/O manager, which
obtains the streams from the stream pipe after they have
been separated, controls the rate of each stream before
forming stream packets for network transmission.

If the streams to be retrieved are individually stored, the
remote stream I/O manager may control the rate of the
streams as they are each separately retrieved and
corresponding tokens are created. In this case, the rate control
functionality of the remote network stream I/O manager is
redundant and does not further change the stream rate before
the stream is transmitted across the network.

Rate control of noninterleaved streams is provided by the
remote stream I/O manager during the scaling process 146,
in which case the remote stream I/O manager retrieves
stream frames from the storage container while skipping
over appropriate stream frames to achieve the prespecified
stream rate. The stream frames which are skipped over are
determined based on the particular compression technology
that was applied to the stream. The remote stream I/O
manager substitutes virtual presentation units for the skipped
stream frames to maintain sequential continuity of the
stream.
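
The skipping step described above can be illustrated with a minimal sketch. It assumes a stream in which every frame is independently decodable, so that any frame may be skipped. The names pres_unit and scale_stream, and the accumulator rule used to spread the retained frames evenly, are hypothetical illustrations and are not taken from the appendices.

```c
#include <stddef.h>

/* Hypothetical presentation unit: a real frame or a virtual stand-in. */
typedef struct {
    int index;       /* position in the original stream sequence */
    int is_virtual;  /* 1 if the frame was skipped and replaced by a virtual unit */
} pres_unit;

/*
 * Scale n frames to rate_percent (0..100) of the nominal rate.
 * An accumulator spreads the retained frames evenly; every skipped frame
 * is replaced by a virtual unit so the sequence stays continuous.
 * Returns the number of real (non-virtual) frames kept.
 */
size_t scale_stream(pres_unit *out, size_t n, int rate_percent)
{
    size_t kept = 0;
    int acc = 100;  /* start full so the first frame is kept when rate_percent > 0 */

    for (size_t i = 0; i < n; i++) {
        out[i].index = (int)i;
        if (rate_percent > 0 && acc >= 100) {
            out[i].is_virtual = 0;   /* retrieve the real frame */
            acc -= 100;
            kept++;
        } else {
            out[i].is_virtual = 1;   /* skip it and substitute a virtual unit */
        }
        acc += rate_percent;
    }
    return kept;
}
```

Under these assumptions, a request for 50 percent of the nominal rate keeps every other frame, while the output sequence still contains one unit per original frame.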

As explained previously regarding flow control
synchronization schemes, a virtual presentation unit comprises a
presentation unit with some amount of substitute media data
information for maintaining a consistent internal state of
stream unit sequence, even while a next sequential unit is
unavailable. Here, in the case of scaling, where virtual units
are employed to scale the transmission rate of streams,
virtual units are additionally employed to reduce the amount
of presentation unit data that is transmitted.

Accordingly, here a virtual video presentation unit
comprises a null presentation unit having a specified
presentation duration and time, or a time stamp, but not containing
any frame presentation information. Then, when the remote
stream I/O manager substitutes a virtual presentation unit for
a skipped stream frame, a transmission packet including the
virtual presentation unit is shorter and more quickly
transmitted than it would be if the skipped frame was included.
When the local stream interpreter and digital presentation
subsystem receive and process the null video unit, they
interpret that unit as an instruction to re-present the most
recently presented frame. In this way, the presentation
subsystem maintains default video presentation data without
requiring that data to be received via a network transmission.
As will be recognized by those skilled in the art of
compression technology, it is alternatively possible, using
appropriate compression techniques, to substitute partial
media information, rather than null information, to increase
or decrease the transmission rate of presentation streams
containing presentation units that will not be presented.
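
The handling of a null video unit on the receiving side can be sketched as follows. This is a minimal illustration, assuming a unit is marked null by an absent data pointer; the types video_unit and presenter and the function present_unit are hypothetical names, not part of the appendices.

```c
#include <stddef.h>

/* Hypothetical video presentation unit; data == NULL marks a null unit. */
typedef struct {
    long time_stamp;            /* presentation time */
    long duration;              /* presentation duration */
    const unsigned char *data;  /* frame data, or NULL for a null unit */
    size_t size;                /* number of bytes in data */
} video_unit;

/* Hypothetical presentation state: remembers the last real frame shown. */
typedef struct {
    const unsigned char *last_frame;
    size_t last_size;
    long repeats;               /* how many null units have been handled */
} presenter;

/*
 * Return the frame data to display for unit u. A null unit carries only
 * timing, so the most recently presented frame is shown again.
 */
const unsigned char *present_unit(presenter *p, const video_unit *u)
{
    if (u->data == NULL) {
        p->repeats++;
        return p->last_frame;
    }
    p->last_frame = u->data;
    p->last_size = u->size;
    return u->data;
}
```

Because the null unit contains no frame data, only its time stamp and duration cross the network; the frame shown in its place is already held by the receiver.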

Rate control of interleaved streams is provided by the
remote network stream I/O manager upon receipt of the
stream tokens from the stream pipes. Here, the remote
network stream I/O manager scales 154 the stream tokens as
they are processed to form transmittal packets. This is
accomplished by processing the stream in a scheme whereby
the remote network stream I/O manager skips over
appropriate tokens and substitutes virtual presentation unit tokens
in their place, depending on the compression technology
used, to achieve the specified stream rate.

In this common and important situation of interleaved
stream storage, the remote network stream I/O manager
participates in stream data flow and thus may be
characterized with a particular process cycle and process period.
During each of its process cycles, the remote network stream
I/O manager processes a single presentation unit and
determines if the next sequential presentation unit is to be
transmitted based on a transmit decision scheme. Like the
process decision schemes described above in connection
with synchronization techniques, the transmit decision
scheme is implemented based on the timing technique of the
stream being processed; if the stream presentation units
include embedded time stamps, then the transmit decision
scheme is based on an explicit timing count, while implicit
timing counting is employed otherwise.

No matter which agent provides the scaling function, only
video streams are scaled while audio stream presentation
frames and tokens are processed at the full nominal
presentation rate, without skipping any audio presentation frames;
this preservation of audio presentation rate inherently
prioritizes audio streams over video streams.

The scaling function is, as explained above, dependent on
the compression technology employed for a particular frame
or stream group. Using, e.g., a key frame-based compression
technique, a key frame is an independently selectable frame
within a stream that contains information required for
decompression of all the following non-key frames
dependent on that key frame. Dependent, or non-key, frames are
not independently selectable. The motion JPEG format
relies on a scheme in which every frame in a stream is a key
frame. During the scaling operation, only key frames are
skipped over, whereby all non-key frames associated with
the skipped key frame are also skipped over. Null frames are
then substituted for the key frame and all of its
corresponding non-key frames.

Appendices L and M provide C-language pseudocode
implementing an implicit timing rate control scheme and an
explicit timing rate control scheme. Like the
synchronization techniques described previously, the implicit rate
control scheme is based on a counting technique and does not
require embedded time codes on the stream presentation
frames. The explicit rate control scheme is based on the use
of time stamps for explicitly determining the presentation
and duration time of a given frame. In either
implementation, virtual presentation units are generated to
accommodate skipped stream frames.

In addition, in either implementation, when skipped
stream frames later become available, they are identified and
skipped over, thereby being deleted, rather than presented.
This presentation unit deletion function, like that employed
in the synchronization schemes, maintains a current
sequential stream progression. Appendices L and M provide
pseudocode for implementing this presentation unit deletion
function.

Adaptive Load Balancing

The DVMS of the invention includes the ability to
automatically and dynamically sense the load of a packet
network in which the system is implemented. Based on the
sensed loading, the stream rate control mechanism described
above is employed by the system to correspondingly and
adaptively balance the load within the network, thereby
optimizing the network utility availability.

Referring to FIG. 11B, in this load balancing scheme,
the local network stream I/O manager 90 monitors 206 the
stream pipes 32 currently transmitting streams between that
manager and the local stream interpreter 28 for variations in
the average queue size, i.e., availability of presentation unit
tokens, of each pipe. When the average queue size varies
significantly, the local network stream I/O manager detects
the direction of the change, i.e., larger or smaller. Thereafter,
it notifies 208 the local stream controller 24 of the change
and requests a new stream presentation token rate to be
transmitted as a percentage of the nominal presentation rate,
based on the change. In turn, the local stream controller
transmits the request to the remote stream controller 84,
which, in response, instructs the remote stream I/O manager
86 and the remote network stream I/O manager 88 to adjust
the stream presentation unit rate to the requested rate.

The requested rate is based on the average queue size in
the following scheme. When the queue size increases
significantly above a prespecified upper availability, the
requested rate is increased; the increased availability
indicates that high-speed processing may be accommodated.
Conversely, when the queue size decreases significantly
below a prespecified lower availability, the requested rate is
decreased; the decreased availability indicates that the
current rate cannot be accommodated and that a lower rate is
preferable.

Alternatively, a user may specify directly a desired stream
presentation rate, that specification being accepted 204 by the
local stream controller 24. The local stream controller
sends the request to the remote stream controller for
implementation.

In the corresponding reverse process, in which stream
frames are stored after being recorded via the local DVMS
manager, the remote stream I/O manager scales 200 the
stream before storage to reconstruct the stream such that it
no longer includes null frames. This function may also be
accomplished by the local network stream I/O manager in a
scaling process 188 completed before a stream is
transmitted.

The DVMS of the invention has been described with
particular detail relating to a preferred embodiment. Other
embodiments are intended to fall within the scope of the
invention. For example, while the DVMS of the invention
has been described in a scheme for managing audio and
video streams, other media data stream types, e.g., stills,
accessed from various media data access points, e.g., a PBX
server, are within the scope of the claims. If the DVMS is
implemented on a computer system or network in software,
programming languages other than the C programming
language may be employed, as will be clear to those skilled
in the art of programming. Alternatively, the DVMS may be
implemented entirely in hardware using standard digital
design techniques, as will also be clear to those skilled in the
art of digital hardware design.
Appendix A

Local Stream Controller

Local_Stream_Controller ( ... ) {
    CNTRLR_MSG message;   /* Stream Controller Message structure */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case OPEN:   /* Open a Stream Group Player instance */
            hStream_Interpreter = Create_Stream_Interpreter ( ... );
            hLocal_Stream_IO_Manager = Create_Local_Stream_IO_Manager ( ... );
            hLocal_Network_Stream_IO_Manager =
                Create_Local_Network_Stream_IO_Manager ( ... );
            break;

        case CLOSE:   /* Close a Stream Group Player instance */
            Delete_Stream_Interpreter ( hStream_Interpreter );
            Delete_Local_Stream_IO_Manager ( hLocal_Stream_IO_Manager );
            Delete_Local_Network_Stream_IO_Manager ( hLocal_Network_Stream_IO_Manager );
            break;

        case LOAD:   /* Load a Stream Group by name */
            hStream_Group = Create_Stream_Group ( sStream_Group_Container, ... );
            if ( local ( hStream_Group ) ) {
                send_message ( hLocal_Stream_IO_Manager, LOAD, hStream_Group, ... );
            }
            else {
                /* Find and connect to the Remote Stream Controller */
                hRemote_Stream_Controller = find ( hStream_Group, ... );
                connect ( hRemote_Stream_Controller, ... );
                /* Open a remote Stream Group player instance */
                send_message ( hRemote_Stream_Controller, OPEN, ... );
                /* Initiate a remote loading of the Stream Group and
                   obtain a handle to the Stream Transport Channel */
                send_message ( hRemote_Stream_Controller, LOAD, hStream_Group,
                               phStream_Channel, ... );
                /* Pass the handle to the Stream Transport Channel to the
                   Local Network Stream I/O Manager and fill the Stream
                   Pipes from the network */
                send_message ( hLocal_Network_Stream_IO_Manager, LOAD, hStream_Group,
                               *phStream_Channel, ... );
            }
            break;

        case UNLOAD:   /* Unload a Stream Group by handle */
            break;

        case PLAY:   /* Play forward the loaded Stream Group */
            if ( local ( hStream_Group ) ) {
                send_message ( hLocal_Stream_IO_Manager, PLAY, hStream_Group, ... );
            }
            else {
                send_message ( hRemote_Stream_Controller, PLAY, hStream_Group, ... );
                send_message ( hLocal_Network_Stream_IO_Manager, PLAY, hStream_Group, ... );
            }
            send_message ( hStream_Interpreter, PLAY, ... );
            break;

        case STOP:   /* Stop and rewind the loaded Stream Group */
            break;

        case PAUSE:   /* Pause the playing Stream Group */
            break;
        }
    }
}   /* End Local_Stream_Controller ( ) */
Appendix B

Stream I/O Manager

Stream_IO_Manager ( ... ) {
    int nStreams;        /* Number of independent Streams in a Stream Group */
    IOMGR_MSG message;   /* Stream I/O Manager Message structure */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case LOAD:   /* Load a Stream Group and fill the Stream Pipes */
            state = LOADED;
            /* set number of independent streams in the Stream_Group */
            nStreams = message.hStream_Group.nStreams;
            break;

        case UNLOAD:   /* Unload a Stream Group and clear the Stream Pipes */
            state = UNLOADED;
            break;

        case PLAY:   /* Start retrieving data and feeding the Stream Pipes */
            state = PLAYING;
            break;

        case PAUSE:   /* Stop retrieving data and feeding the Stream Pipes */
            state = LOADED;
            break;
        }

        if ( state == PLAYING ) {
            int i;
            for ( i = 0; i <= nStreams; i++ ) {
                Retrieve ( next presentation unit );
                Enqueue ( next presentation unit );
            }
        }
    }
}   /* End Stream_IO_Manager ( ) */
Appendix C

Stream Interpreter

Stream_Interpreter ( ) {
    int nStreams;           /* Number of independent Streams in a Stream Group */
    INTRPRTR_MSG message;   /* Stream Interpreter Message structure */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case LOAD:   /* set number of independent streams in the Stream_Group */
            nStreams = message.hStream_Group.nStreams;
            break;

        case UNLOAD:   /* */
            break;

        case PLAY:   /* */
            break;

        case PAUSE:   /* */
            break;
        }

        if ( state == PLAYING ) {
            int i;
            for ( i = 0; i <= nStreams; i++ ) {
                Present ( ... );   /* Present the next presentation unit */
            }
        }
    }
}   /* End Stream_Interpreter ( ) */
Appendix D

Base Level Implicit Timing Synchronization

#define T < fixed presentation duration of a presentation unit >

int p;   /* consumed presentation units */
int t;   /* reference time base */

Present ( ... ) {
    boolean done = FALSE;

    if ( t < p*T ) {   /* Continue presenting current presentation unit */
        return;
    }

    while ( !done ) {
        /* Consume and play a new presentation unit */
        if ( (p*T <= t) && (t < (p+1)*T) ) {
            Consume_and_Present ( next presentation unit );   /* note 1 */
            p = p + 1;
            done = TRUE;
        }
        /* Catch up to current time relative to reference time base */
        if ( (p+1)*T <= t ) {
            Consume_and_Process ( next presentation unit );   /* note 2 */
            p = p + 1;
        }
    }
}   /* End Present ( ) */

1 Consume and Present operation refers to any decompression and processing required for presentation.
2 Consume and Process operation includes decompression and internal state maintenance for algorithms using
temporal prediction.
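
The implicit timing scheme of this appendix can be exercised with a small runnable sketch. It is a restructured illustration rather than a transcription: the reference time is passed in as a parameter, late units are dropped before the in-window unit is presented, and the names sync_state, cycle_result, and present_cycle are hypothetical.

```c
/* Outcome of one presentation cycle. */
typedef struct {
    int presented;  /* index of the unit presented this cycle, or -1 if none */
    int dropped;    /* number of late units consumed without presentation */
} cycle_result;

/* Hypothetical state for implicit (count-based) timing. */
typedef struct {
    long T;  /* fixed presentation duration of one unit */
    long p;  /* count of units consumed so far */
} sync_state;

/*
 * Run one cycle at reference time t. Unit k owns the window
 * [k*T, (k+1)*T); units whose window has fully passed are consumed
 * and dropped, and the unit whose window contains t is presented.
 */
cycle_result present_cycle(sync_state *s, long t)
{
    cycle_result r = { -1, 0 };

    if (t < s->p * s->T)   /* current unit is still being presented */
        return r;

    while ((s->p + 1) * s->T <= t) {   /* catch up: drop late units */
        s->p++;
        r.dropped++;
    }

    r.presented = (int)s->p;   /* here p*T <= t < (p+1)*T holds */
    s->p++;
    return r;
}
```

With T = 10, a call at t = 45 after two units have been consumed drops units 2 and 3 and presents unit 4, which is the catch-up behavior the pseudocode describes.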
Appendix E

Base Level Explicit Timing Synchronization

#define T < fixed presentation duration of a presentation unit >

int p;   /* presentation time of next presentation unit */
int d;   /* presentation duration of next presentation unit */
int t;   /* reference time base */

Present ( ... ) {
    boolean done = FALSE;

    if ( t < p + d ) {
        /* Continue presenting current presentation unit */
        return;
    }

    while ( !done ) {
        /* Get new presentation time and duration */
        p = presentation_time ( next presentation unit );
        d = presentation_duration ( next presentation unit );
        /* Consume and play a new presentation unit */
        if ( (p <= t) && (t < (p+d)) ) {
            Consume_and_Present ( next presentation unit );
            done = TRUE;
        }
        if ( (p + d) <= t ) {   /* Catch up to current time relative to reference time base */
            Consume_and_Process ( next presentation unit );
        }
    }
}   /* End Present ( ) */
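
The explicit timing decision can likewise be sketched in runnable form. This is a minimal illustration, assuming the pending units sit in an array ordered by time stamp; the type timed_unit and the function present_explicit are hypothetical names.

```c
/* Hypothetical presentation unit carrying an embedded time stamp. */
typedef struct {
    long time;      /* presentation time */
    long duration;  /* presentation duration */
} timed_unit;

/*
 * Decide what to present at reference time t from queue q of n units.
 * *next is the index of the next unconsumed unit and is advanced past
 * every unit consumed. Returns the index of the unit presented, or -1
 * if no unit is due. Units whose window has passed are consumed
 * without being presented.
 */
int present_explicit(const timed_unit *q, int n, int *next, long t)
{
    while (*next < n) {
        const timed_unit *u = &q[*next];

        if (t < u->time)
            return -1;                 /* next unit is not due yet */

        (*next)++;                     /* the unit is consumed either way */
        if (t < u->time + u->duration)
            return *next - 1;          /* inside its window: present it */
        /* otherwise the unit is late and is dropped */
    }
    return -1;
}
```

Unlike the implicit scheme, nothing here depends on a fixed unit duration, so units of varying length are handled by the same comparison.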
Appendix F

Flow Control Implicit Timing Synchronization

#define T < fixed presentation duration of a presentation unit >

int p;           /* consumed presentation units */
int t;           /* reference time base */
int vpu_count;   /* differential count of virtual presentation units */

Present ( ... ) {
    boolean done = FALSE;

    while ( vpu_count ) {   /* Consume and drop redundant presentation units */
        Consume_and_Process ( next presentation unit );
        vpu_count--;
    }

    if ( t < p*T ) {   /* Continue presenting current presentation unit */
        return;
    }

    while ( !done ) {
        if ( Stream_Pipe != EMPTY ) {
            if ( (p*T <= t) && (t < (p+1)*T) ) {
                /* Consume and play a new presentation unit */
                Consume_and_Present ( next presentation unit );
                p = p + 1;
                done = TRUE;
            }
            /* Catch up to current time relative to reference time base */
            if ( (p+1)*T <= t ) {
                Consume_and_Process ( next presentation unit );
                p = p + 1;
            }
        }
        else {
            if ( (p*T <= t) && (t < (p+1)*T) ) {
                /* Generate and play a new presentation unit */
                Fabricate ( virtual presentation unit );
                vpu_count++;
                Consume_and_Present ( virtual presentation unit );
                done = TRUE;
                p = p + 1;
            }
        }
    }
}   /* End Present ( ) */
Appendix G

Flow Control Explicit Timing Synchronization

#define D < default presentation duration of a virtual presentation unit >

int p;   /* presentation time of next presentation unit */
int d;   /* presentation duration of next presentation unit */
int t;   /* reference time base */

Present ( ... ) {
    boolean done = FALSE;

    if ( t < p + d ) {
        /* Continue presenting current presentation unit */
        return;
    }

    while ( !done ) {
        if ( Stream_Pipe != EMPTY ) {
            /* Get new presentation time and duration */
            p = presentation_time ( next presentation unit );
            d = presentation_duration ( next presentation unit );
            if ( (p <= t) && (t < (p+d)) ) {   /* Consume and play a new presentation unit */
                Consume_and_Present ( next presentation unit );
                done = TRUE;
            }
            if ( (p + d) <= t ) {   /* Catch up to current time relative to reference time base */
                Consume_and_Process ( next presentation unit );
            }
        } else {
            p = p + d;
            d = D;
            if ( (p <= t) && (t < (p+d)) ) {
                Fabricate ( virtual presentation unit );
                Consume_and_Present ( virtual presentation unit );
                done = TRUE;
            }
        }
    }
}   /* End Present ( ) */
Appendix H

Remote Stream Controller

Remote_Stream_Controller ( ... ) {
    int nStreams;         /* Number of independent Streams in a Stream Group */
    CNTRLR_MSG message;   /* Stream Controller Message structure */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case OPEN:   /* Open a Stream Group Player instance */
            hRemote_Stream_IO_Manager = Create_Remote_Stream_IO_Manager ( ... );
            hRemote_Network_Stream_IO_Manager =
                Create_Remote_Network_Stream_IO_Manager ( ... );
            break;

        case CLOSE:   /* Close a Stream Group Player instance */
            Delete_Remote_Stream_IO_Manager ( hRemote_Stream_IO_Manager );
            Delete_Remote_Network_Stream_IO_Manager
                ( hRemote_Network_Stream_IO_Manager );
            break;

        case LOAD:   /* Load a Stream Group by name */
            hStream_Group = Create_Stream_Group ( sStream_Group_Container, ... );
            send_message ( hRemote_Stream_IO_Manager, LOAD, hStream_Group, ... );
            /* Obtain a handle to the stream channel from the Remote Network Stream I/O Manager */
            send_message ( hRemote_Network_Stream_IO_Manager, LOAD, hStream_Group,
                           phStream_Channel, ... );
            /* Reply to the Local Stream Controller and return a handle to the stream channel */
            *message.phStream_Channel = *phStream_Channel;
            reply_message ( message.sender, ... );
            break;

        case UNLOAD:   /* Unload a Stream Group by handle */
            break;

        case PLAY:   /* Play forward the loaded Stream Group */
            send_message ( hRemote_Stream_IO_Manager, PLAY, ... );
            send_message ( hRemote_Network_Stream_IO_Manager, PLAY, ... );
            break;

        case STOP:   /* Stop and rewind the loaded Stream Group */
            break;

        case PAUSE:   /* Pause the playing Stream Group */
            break;
        }
    }
}   /* Remote_Stream_Controller ( ) */
Appendix I

Remote Stream I/O Manager

Remote_Stream_IO_Manager ( ... ) {
    int nStreams;        /* Number of independent Streams in a Stream Group */
    IOMGR_MSG message;   /* Stream I/O Manager Message structure */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case LOAD:   /* Load a Stream Group and fill the Stream Pipes */
            state = LOADED;
            /* set number of independent Streams in the Stream Group */
            nStreams = message.hStream_Group.nStreams;
            break;

        case UNLOAD:   /* Unload a Stream Group and clear the Stream Pipes */
            state = UNLOADED;
            break;

        case PLAY:   /* Start retrieving data and feeding the Stream Pipes */
            state = PLAYING;
            break;

        case PAUSE:   /* Stop retrieving data and feeding the Stream Pipes */
            state = LOADED;
            break;
        }

        if ( state == PLAYING ) {
            int i;
            for ( i = 0; i <= nStreams; i++ ) {
                Enqueue ( next presentation unit );
            }
        }
    }
}   /* End Remote_Stream_IO_Manager */
Appendix J

Remote Network Stream I/O Manager

Remote_Network_Stream_IO_Manager ( ... ) {
    int nStreams;          /* Number of independent Streams in a Stream Group */
    IOMGR_MSG message;     /* Stream I/O Manager Message structure */
    int hStream_Channel;   /* Handle to the Stream Transport Channel */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case LOAD:   /* Load a Stream Group and fill the Stream Pipes */
            state = LOADED;
            /* set number of independent streams in the Stream_Group */
            nStreams = message.hStream_Group.nStreams;
            /* Create a separate Stream Transport Channel for data flow */
            hStream_Channel = Create_Stream_Channel ( message.hStream_Group );
            /* Reply to the Local Stream Controller and return the Stream Channel */
            *message.phStream_Channel = hStream_Channel;
            reply_message ( message.sender, ... );
            break;

        case UNLOAD:   /* Unload a Stream Group and clear the Stream Pipes */
            state = UNLOADED;
            break;

        case PLAY:   /* Start retrieving data and feeding the Stream Pipes */
            state = PLAYING;
            break;

        case PAUSE:   /* Stop retrieving data and feeding the Stream Pipes */
            state = LOADED;
            break;
        }

        if ( state == PLAYING ) {
            int i;
            for ( i = 0; i <= nStreams; i++ ) {
                Transmit ( hStream_Channel, next presentation unit );
            }
        }
    }
}   /* End Remote_Network_Stream_IO_Manager ( ) */
Appendix K

Local Network Stream I/O Manager

Local_Network_Stream_IO_Manager ( ... ) {
    int nStreams;          /* Number of independent Streams in a Stream Group */
    IOMGR_MSG message;     /* Stream I/O Manager Message structure */
    int hStream_Channel;   /* Handle to the Stream Transport Channel */
    initialize ( ... );
    while ( for_ever ) {
        message = receive_message ( ... );
        switch ( message.operation ) {

        case LOAD:   /* Load a Stream Group and fill the Stream Pipes */
            state = LOADED;
            /* set number of independent streams in the Stream_Group */
            nStreams = message.hStream_Group.nStreams;
            /* Receive a separate Stream Transport Channel for data flow */
            hStream_Channel = message.hStream_Channel;
            /* Find and connect to the Remote Network Stream I/O Manager */
            hRemote_Network_Stream_IO_Manager = find ( hStream_Channel, ... );
            connect ( hRemote_Network_Stream_IO_Manager, ... );
            break;

        case UNLOAD:   /* Unload a Stream Group and clear the Stream Pipes */
            state = UNLOADED;
            break;

        case PLAY:   /* Start retrieving data and feeding the Stream Pipes */
            state = PLAYING;
            break;

        case PAUSE:   /* Stop retrieving data and feeding the Stream Pipes */
            state = LOADED;
            break;
        }

        if ( state == PLAYING ) {
            for ( i = 0; i <= nStreams; i++ ) {
                Enqueue ( next presentation unit );
            }
            Feed_Back ( );
        }
    }
}   /* End Local_Network_Stream_IO_Manager ( ) */
Appendix L

Implicit Timing Rate Control

int p = 0;   /* consumed presentation units */
int t = 0;   /* reference time base */
int T;       /* Nominal presentation duration */
int D;       /* Requested presentation duration */

Transmit ( ... ) {
    boolean done = FALSE;

    while ( !done ) {
        if ( (p*T <= t) && (t < (p+1)*T) ) {
            /* Consume and transmit the next presentation unit */
            Consume_and_Transmit ( next presentation unit );
            p = p + 1;
            done = TRUE;
        }
        if ( (p+1)*T <= t ) {
            /* Adjust the video rate by transmitting null presentation units */
            Fabricate ( null presentation unit );
            Consume_and_Transmit ( null presentation unit );
            p = p + 1;
        }
    }

    /* Increment virtual stream time */
    t = t + D;
}   /* End Transmit ( ) */
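
The behavior of this rate control loop can be checked with a runnable sketch. It is an illustration under stated assumptions, not a transcription: the state is gathered into a hypothetical rate_state structure, the function transmit_cycle is a hypothetical name, and the requested duration D is assumed to be at least the nominal duration T, since a smaller D would leave the loop of the pseudocode with no unit due.

```c
/* Hypothetical state for implicit timing rate control. */
typedef struct {
    long T;  /* nominal presentation duration */
    long D;  /* requested presentation duration; assumed D >= T */
    long p;  /* units consumed so far */
    long t;  /* virtual stream time */
} rate_state;

/*
 * Run one transmit cycle. Units whose window [k*T, (k+1)*T) lies
 * entirely before the virtual time t are replaced by null units;
 * the unit whose window contains t is transmitted as a real unit.
 * Returns the index of the real unit sent, or -1 if none was sent.
 * *nulls receives the number of null units sent in this cycle.
 */
long transmit_cycle(rate_state *s, int *nulls)
{
    long sent = -1;
    *nulls = 0;

    if (s->D < s->T)   /* faster-than-nominal requests are outside this sketch */
        return -1;

    while ((s->p + 1) * s->T <= s->t) {   /* skipped units become null units */
        s->p++;
        (*nulls)++;
    }
    if (s->p * s->T <= s->t) {            /* the unit due now is sent as-is */
        sent = s->p;
        s->p++;
    }

    s->t += s->D;   /* increment virtual stream time */
    return sent;
}
```

With T = 1 and D = 2, successive cycles send units 0, 2, and 4 as real units and replace units 1 and 3 with null units, which corresponds to half the nominal rate.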
Appendix M

Explicit Timing Rate Control

int p;       /* presentation time */
int d;       /* presentation duration */
int t = 0;   /* reference virtual time base */
int D;       /* Requested presentation duration */

Transmit ( ... ) {
    boolean done = FALSE;

    while ( !done ) {
        p = presentation_time ( next presentation unit );
        d = presentation_duration ( next presentation unit );
        if ( (p <= t) && (t < (p + d)) ) {
            /* Consume and transmit the next presentation unit */
            Consume_and_Transmit ( next presentation unit );
            done = TRUE;
        }
        if ( (p + d) <= t ) {
            /* Adjust the video rate by transmitting null presentation units */
            Fabricate ( null presentation unit );
            Consume_and_Transmit ( null presentation unit );
        }
    }

    /* Increment virtual stream time */
    t = t + D;
}   /* End Transmit ( ) */
Appendix N

Adaptive Load Balancing with Feedback

#define Ncycles < number of cycles over which average is calculated >
#define Nbands < number of bands the pipe size is divided into >

int cycles = 0;         /* count of cycles for averaging */
int average_sum = 0;    /* count of running sum for average calculation */
int Stream_Pipe_size;   /* size of Stream Pipe measured in presentation units */
int previous_average_pipe_size_index = Nbands;
int Rate_Table[ Nbands ];   /* Table for converting pipe size index to desired rate */

Feed_Back ( ) {
    boolean feedback = FALSE;
    int average_pipe_size;
    int average_pipe_size_index;

    if ( cycles ) {
        average_sum = average_sum + Stream_Pipe_size;
        cycles--;
    }
    else {
        average_pipe_size = (average_sum / Ncycles) * 100;
        average_pipe_size_index = average_pipe_size / Nbands;
        if ( average_pipe_size_index < (previous_average_pipe_size_index - 1) ) {
            feedback = TRUE;
            presentation_data_rate = Rate_Table[ average_pipe_size_index ];
        }
        if ( average_pipe_size_index > (previous_average_pipe_size_index + 1) ) {
            feedback = TRUE;
            presentation_data_rate = Rate_Table[ average_pipe_size_index ];
        }
        previous_average_pipe_size = average_pipe_size;
        cycles = Ncycles;
    }

    if ( feedback ) {
        callback ( hLocal_Stream_Controller, FEEDBACK, presentation_data_rate );
    }
}
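
The feedback decision of this appendix can be sketched in runnable form. The sketch is a simplified illustration under stated assumptions: the averaging window NCYCLES, the band count NBANDS, the structure balancer, the function feed_back, and the contents of the rate table are all hypothetical, and the running sum is reset after each window, which the pseudocode does not show.

```c
#include <stdlib.h>   /* abs */

#define NCYCLES 4   /* assumed number of samples per averaging window */
#define NBANDS  8   /* assumed number of bands the pipe size is divided into */

/* Hypothetical load balancing state for one stream pipe. */
typedef struct {
    int capacity;             /* pipe capacity in presentation units */
    int cycles;               /* samples collected in the current window */
    long sum;                 /* running sum of sampled queue sizes */
    int prev_band;            /* band computed for the previous window */
    int rate_table[NBANDS];   /* requested rate (percent of nominal) per band */
} balancer;

/*
 * Record one queue size sample. At the end of each window the average
 * is mapped to a band; if the band differs from the previous one by
 * more than one step, the rate for the new band is returned so that it
 * can be requested from the stream controller. Returns -1 otherwise.
 */
int feed_back(balancer *b, int queue_size)
{
    int rate = -1;

    b->sum += queue_size;
    if (++b->cycles < NCYCLES)
        return -1;

    long avg = b->sum / NCYCLES;
    int band = (int)(avg * NBANDS / (b->capacity + 1));   /* 0..NBANDS-1 */

    if (abs(band - b->prev_band) > 1)   /* significant change only */
        rate = b->rate_table[band];

    b->prev_band = band;
    b->sum = 0;
    b->cycles = 0;
    return rate;
}
```

Requiring a change of more than one band before requesting a new rate follows the comparison in the pseudocode and keeps small fluctuations in queue size from generating a stream of rate requests.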
What is claimed is:

1. A computer-based media data processor for controlling
the timing of computer processing of digitized continuous
time-based media data composed of a sequence of presen- 
tation units, each unit characterized by a prespecified pre- 
sentation duration during a computer presentation of the 
media data, the media processor comprising: 

a reference clock which indicates a start time of presen- 
tation processing of the media data presentation units 
and which maintains a current presentation time as the 
media data presentation unit sequence is processed for 
presentation; 

a counter for counting each presentation unit in the 
presentation unit sequence after that presentation unit is 
processed for presentation, to maintain a current pre- 
sentation unit count; and 

a comparator connected to the reference clock and the 
counter, and programmed with the prespecified presen- 
tation duration, the comparator comparing a product of 
the presentation unit duration and the current presen- 
tation unit count, specified by the counter, with the 
current presentation time, specified by the reference 
clock, after each presentation unit is processed for 
presentation, and based on the comparison, releasing a 
next sequential presentation unit to be processed for 
presentation when the product matches the current 
presentation time count and deleting a next sequential 
presentation descriptor in that sequence when the product
exceeds the current presentation time count.

2. The media data processor of claim 1 wherein the media 
data presentation unit sequence comprises a video frame 
sequence including a plurality of intracoded video frames. 

3. The media data processor of claim 2 wherein each 
frame of the video frame sequence comprises an intracoded 
video frame. 

4. The media data processor of claim 3 wherein the video 
frame sequence comprises a motion JPEG video sequence. 

5. The media data processor of claim 2 wherein each of 
the plurality of intracoded video frames comprises a key 
frame and is followed by a plurality of corresponding 
non-key frames, each key frame including media data infor- 
mation required for presentation of the following corre- 
sponding non-key frames. 

6. The media data processor of claim 1 further comprising 
a flow controller, connected to said comparator, for receiv- 
ing an indication from the comparator that a presentation 
unit should be released for presentation, determining
availability of a next presentation unit in the presentation unit
sequence to be processed, and based on that availability, 
generating and releasing a virtual presentation unit of the 
prespecified presentation duration to be presented as a 
default presentation unit in place of a next presentation unit 
when a next presentation unit is not available and until the 
next presentation unit is available. 

7. The media data processor of claim 6 wherein the flow 
controller is adapted to monitor and identify a previously 
unavailable presentation unit when that unit is later
available, and in response to identification of the later 
available unit, withholding the unit from release for 
presentation, whereby the later available unit is not pre- 
sented. 
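
Illustrative sketch (not part of the claims): one way to picture the flow controller of claims 6 and 7 is a small object that substitutes a virtual unit when the next unit has not arrived and remembers which positions were substituted, so that a unit arriving late for such a position is withheld. The class name `FlowController`, its methods, and the dictionary of arrived units are hypothetical.

```python
class FlowController:
    """Substitutes virtual units and withholds units that arrive too late."""

    def __init__(self):
        self.next_index = 0       # position of the next unit to release
        self.substituted = set()  # positions already filled by a virtual unit

    def release(self, arrived: dict):
        """Release the next unit; 'arrived' maps position -> media data."""
        index = self.next_index
        self.next_index += 1
        if index in arrived:
            return ("real", arrived[index])
        # The unit is not available: present a virtual (default) unit instead.
        self.substituted.add(index)
        return ("virtual", None)

    def is_withheld(self, index: int) -> bool:
        """A late unit is withheld if a virtual unit already took its place."""
        return index in self.substituted
```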

8. The media data processor of claim 6 wherein the media 
data presentation unit sequence comprises a motion JPEG 
video sequence, the presentation units comprise video 
frames, and wherein each virtual presentation unit comprises 
a most recently presented video frame. 
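
Illustrative sketch (not part of the claims): claim 8 describes the virtual unit for a video sequence as the most recently presented frame. Assuming a list in which `None` marks a frame that was unavailable at its release time, the substitution could look like this; the function name and the list representation are assumptions, and the behavior before any frame has been presented is not specified by the claim.

```python
def fill_with_virtual_frames(frames: list) -> list:
    """Replace each unavailable frame (None) with the last presented frame."""
    presented = []
    last = None  # nothing has been presented yet
    for frame in frames:
        if frame is not None:
            last = frame
        # When the frame is missing, 'last' is repeated as the virtual unit.
        presented.append(last)
    return presented
```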

9. The media data processor of claim 1 wherein the media 
data presentation unit sequence comprises an audio 
sequence. 



10. The media data processor of claim 1 wherein said 
clock is adapted to indicate a start time of presentation 
processing of a plurality of media data presentation unit 
sequences, the start time being common to the plurality of 
sequences, and which maintains a current presentation time 
as the media data sequences are processed for presentation; 

a counter for counting each presentation unit in the 
plurality of presentation unit sequences after that pre- 
sentation unit is processed for presentation, to maintain 
a distinct current presentation unit count for each 
presentation unit sequence; and 
a comparator connected to the reference clock and the 
counter, and programmed with the prespecified presentation
duration, the comparator comparing for each of
the plurality of presentation unit sequences a product of 
the presentation unit duration and the current presen- 
tation unit count of that sequence, specified by the 
counter, with the current presentation time, specified by 
the reference clock, after each presentation unit from 
that sequence is processed for presentation, and based
on the comparison, releasing a next sequential presen- 
tation unit in that presentation unit sequence to be 
processed for presentation when the product matches 
the current presentation time count, and deleting a next 
sequential presentation unit in that presentation unit 
sequence when the product exceeds the current presentation
time count, whereby the plurality of media data
presentation unit sequences are synchronously pro- 
cessed for simultaneous synchronous presentation. 
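
Illustrative sketch (not part of the claims): claim 10 extends the comparison to several sequences that share one start time and one reference clock but keep a distinct unit count each. The class name, the string results, and the millisecond units below are assumptions for illustration; the case where the product is smaller than the current time is not covered by the claim and is reported as "none".

```python
class MultiSequenceComparator:
    """Keeps one unit count per sequence against a shared presentation time."""

    def __init__(self, durations_ms: dict):
        self.durations_ms = dict(durations_ms)           # per-sequence unit duration
        self.counts = {name: 0 for name in durations_ms}  # per-sequence unit count

    def unit_processed(self, name: str) -> None:
        """Count a unit after it has been processed for presentation."""
        self.counts[name] += 1

    def decide(self, name: str, current_time_ms: int) -> str:
        """Compare duration x count for one sequence with the shared clock."""
        product = self.durations_ms[name] * self.counts[name]
        if product == current_time_ms:
            return "release"
        if product > current_time_ms:
            return "delete"
        return "none"
```

Because every sequence is compared with the same clock, the sequences stay aligned with each other without being compared directly.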

11. The media data processor of claim 10 wherein the 
plurality of media data presentation unit sequences comprise 
an intracoded video frame sequence and an audio sequence. 

12. A computer-based media data processor for control- 
ling the computer presentation of digitized continuous time- 
based media data composed of a sequence of presentation 
units, each unit characterized by a prespecified presentation 
duration and presentation time during a computer presenta- 
tion of the media data and further characterized as a distinct 
media data type, the media data processor comprising: 

a media data input manager for retrieving media data from 
a corresponding media data access location in response 
to a request for computer presentation of specified 
presentation unit sequences, determining the media 
data type of each presentation unit in the retrieved 
media data, designating each retrieved presentation unit 
to a specified media data presentation unit sequence 
based on the media data type determination for that 
presentation unit, assembling a sequence of presentation
descriptors for each of the specified presentation
unit sequences, each presentation descriptor compris- 
ing presentation unit media data for one designated 
presentation unit in that sequence, all presentation 
descriptors in an assembled sequence being of a com- 
mon media data type, associating each presentation 
descriptor with a corresponding presentation duration 
and presentation time, based on the retrieved media 
data, and linking the presentation descriptors in each 
assembled sequence to establish a progression of pre- 
sentation units in each of the sequences; and 
a media data interpreter, connected to the media data input 
manager, for accepting from the media data input 
manager the assembled presentation descriptor 
sequences one descriptor at a time and releasing the 
sequences for presentation one presentation unit at a 
time, indicating a start time of presentation processing 
of the presentation unit sequences, maintaining a current
presentation time as the sequences are processed
for presentation, counting each unit in the sequences
after that unit is released to be processed for
presentation, to maintain a distinct current presentation
unit count for each sequence, comparing for each of the
presentation unit sequences a product of the presentation
unit duration and the current presentation unit
count of that sequence with the currently maintained
presentation time after each unit from that sequence is
processed for presentation, and based on the
comparison, releasing for presentation processing a
next sequential presentation unit in that sequence when
the product matches the currently maintained presentation
time count and deleting a next sequential presentation
unit in that presentation unit sequence when
the product exceeds the currently maintained presentation
time count.

13. The media data processor of claim 12 wherein the
media data access location comprises a computer storage
location.

14. The media data processor of claim 13 further comprising
a presentation unit sequence controller for initiating
the media data input manager and the media data interpreter,
specifying to the media data input manager and the media
data interpreter the presentation unit sequences to be
presented, and controlling starting and stopping of sequence
presentation in response to user specifications.

15. The media data processor of claim 13 wherein the
specified media data presentation unit sequences comprise a
video frame sequence including a plurality of intracoded
video frames.

16. The media data processor of claim 15 wherein each
frame of the video frame sequence comprises an intracoded
video frame.

17. The media data processor of claim 16 wherein the
video frame sequence comprises a motion JPEG video
sequence.

18. The media data processor of claim 15 wherein each of
the plurality of intracoded video frames comprises a key
frame and is followed by a plurality of corresponding
non-key frames, each key frame including media data information
required for presentation of the following corresponding
non-key frames.

19. The media data processor of claim 16 wherein the
specified media data presentation unit sequences comprise a
motion JPEG video sequence and an audio sequence.

20. The media processor of claim 14 wherein the media
data interpreter further determines for each specified presentation
unit sequence availability of a next presentation
descriptor when based on said comparison a next presentation
unit should be released for presentation, and based on
that availability, generates and releases a virtual presentation
unit of the prespecified presentation duration to be presented
as a default presentation unit each time a next presentation
unit in that sequence is not available for presentation and
until the next presentation unit is available.

21. The media processor of claim 20 wherein the local
media data interpreter is adapted to monitor and identify a
previously unavailable presentation unit when that descriptor
is later available, and in response to identification of the
later available descriptor, withholding the later available
presentation unit from release for presentation, whereby the
later available presentation unit is not presented.

22. The media data processor of claim 20 wherein the
plurality of media data presentation unit sequences comprises
an intracoded video sequence of video frames and an
audio sequence of audio samples, and wherein each virtual
video presentation unit comprises a most recently presented
video frame and each virtual audio presentation unit comprises
a silent audio sample.

23. The media data processor of claim 12 wherein the
media data retrieved by the media data input manager
comprises a storage presentation unit sequence composed of
presentation units for the specified presentation unit
sequences, presentation units of the specified presentation
unit sequences being alternately interleaved to compose the
storage presentation unit sequence.

24. The media data processor of claim 12 wherein the
media data retrieved by the media data input manager
comprises a plurality of storage presentation unit sequences,
each storage presentation unit sequence composed of presentation
units for a specified presentation unit sequence and
all presentation units in a storage presentation unit sequence
being of a common media data type.

25. The media data processor of claim 24 wherein the start
time of presentation processing indicated by the media data
interpreter is common to all of the specified presentation unit
sequences, whereby the specified presentation unit
sequences are synchronously processed for simultaneous
synchronous presentation.

26. The media data processor of claim 25 wherein the
specified presentation unit sequences comprise a video presentation
unit sequence of intracoded video frames and an
audio presentation unit sequence of audio samples, and
wherein the media data interpreter prioritizes audio presentation
units over video presentation units by generating and
releasing a virtual video frame to be presented as a default
presentation unit each time a next presentation unit is not
available for presentation and until the next presentation unit
is available, the virtual video frame comprising a most
recently presented video frame.

27. The media data processor of claim 14 wherein the
retrieved media data presentation units are encoded in a
storage code and compressed, and further comprising a
presentation system for decoding the presentation units,
decompressing the presentation units, and converting the
digitized presentation units to a corresponding analog representation
for presentation.

28. The media data processor of claim 12 wherein the
media data interpreter maintains the current presentation
time at a prespecified time rate such that presentation units
of the specified presentation sequences are each presented
for a presentation duration different than the prespecified
presentation duration.

29. The media data processor of claim 12 wherein the
media data interpreter monitors for each specified presentation
unit sequence an actual presentation rate of the
presentation descriptors in that sequence released for
presentation, compares the actual presentation rate with a
prespecified nominal presentation rate, and based on the
comparison, generates and releases a virtual presentation
unit of the prespecified presentation duration to be presented
as a default presentation unit each time the monitored
presentation rate is greater than the prespecified presentation
rate, and based on the comparison, skips over a presentation
unit each time the monitored presentation rate is less than the
prespecified presentation rate.

30. A computer-based method for controlling the timing
of computer processing of digitized continuous time-based
media data composed of a sequence of presentation units,
each unit characterized by a prespecified presentation duration
during a computer presentation of the media data, the
method comprising:

indicating a start time of presentation processing of the
media data presentation units;
maintaining a current presentation time as the media data
presentation unit sequence is processed for presentation;

counting each presentation unit in the presentation unit
sequence after that presentation unit is processed for
presentation, to maintain a current presentation unit
count; and

comparing a product of the presentation unit duration and
the current presentation unit count with the current
presentation time after a presentation unit is processed
for presentation, and based on the comparison, releasing
a presentation unit next in the presentation unit
sequence to be processed for presentation when the
product matches the current presentation time count,
and deleting a presentation unit next in the presentation
unit sequence when the product exceeds the current
presentation time count.

31. The media data processor of claim 30 wherein the
specified media data presentation unit sequence comprise a
video frame sequence including a plurality of intracoded
video frames.

32. The media data processor of claim 31 wherein each
frame of the video frame sequence comprises an intracoded
video frame.

33. The media data processor of claim 32 wherein the
video frame sequence comprises a motion JPEG video
sequence.

34. The media data processor of claim 31 wherein each of
the plurality of intracoded video frames comprises a key
frame and is followed by a plurality of corresponding
non-key frames, each key frame including media data information
required for presentation of the following corresponding
non-key frames.

35. The method of claim 30 further comprising:
determining the availability of a next presentation unit in
the presentation unit sequence to be processed, and
based on that availability, generating and releasing a
virtual presentation unit of the prespecified presentation
duration to be presented as a default presentation
unit in place of the next presentation unit when a next
presentation unit is not available and until the next
presentation unit is available.

36. The method of claim 35 further comprising:
identifying a previously unavailable presentation unit
when that unit is later available; and
in response to the identification of the later available unit,
withholding the unit from release for presentation,
whereby the later available unit is not presented.

37. A computer-based method for controlling the computer
presentation of digitized continuous time-based media
data composed of a sequence of presentation units, each unit
characterized by a prespecified presentation duration and
presentation time during a computer presentation of the
media data and further characterized as a distinct media data
type, the method comprising:

retrieving media data from a computer storage location in
response to a request for computer presentation of
specified presentation unit sequences;

determining the media data type of each presentation unit
in the retrieved media data;

designating each retrieved presentation unit to a specified
media data presentation unit sequence based on the
media data type determination for that presentation
unit;

assembling a sequence of presentation descriptors for
each of the specified presentation unit sequences, each
descriptor comprising media data for one designated
presentation unit in that sequence, each sequence of
presentation descriptors being of a common media data
type;

associating each presentation descriptor with a corresponding
presentation duration and presentation time,
based on the retrieved media data;

linking the presentation descriptors of each sequence to
establish a progression of presentation units in that
sequence;

indicating a start time of presentation processing of the
presentation descriptor sequences;

maintaining a current presentation time as the sequences
are processed for presentation;

counting each presentation unit in the
sequences after that unit is processed for presentation,
to maintain a distinct current presentation unit count for
each sequence;

comparing for each of the presentation unit sequences a
product of the presentation unit duration and the current
presentation unit count of that sequence with the current
presentation time after each presentation unit from
that sequence is processed for presentation, and based
on the comparison, releasing a presentation unit next in
that presentation unit sequence to be processed for
presentation when the product matches the current
presentation time count, and deleting a presentation
unit next in that presentation unit sequence when the
product exceeds the current presentation time count.

38. The method of claim 37 wherein the retrieved media
data comprises a storage presentation unit sequence composed
of presentation units for the specified presentation unit
sequences, presentation units of the specified presentation
unit sequences being alternately interleaved to compose the
storage presentation unit sequence.

39. The method of claim 38 wherein the start time of
presentation processing is common to all of the specified
presentation unit sequences, whereby the specified presentation
unit sequences are synchronously processed for simultaneous
synchronous presentation.

40. The media data processor of claim 39 wherein the
presentation unit sequences comprise a
video frame sequence including a plurality of intracoded
video frames.

41. The media data processor of claim 40 wherein each
frame of the video frame sequence comprises an intracoded
video frame.

42. The media data processor of claim 1 wherein each of
the plurality of intracoded video frames comprises a key
frame and is followed by a plurality of corresponding
non-key frames, each key frame including media data information
required for presentation of the following corresponding
non-key frames.

43. The media data processor of claim 41 wherein the
specified media data presentation unit sequences comprise a
motion JPEG video sequence and an audio sequence.

44. A computer-based media data processor for controlling
transmission of digitized media data in a packet switching
network, the media data comprising a sequence of
continuous time-based presentation units, each unit characterized
by a prespecified presentation duration and presentation
time during a computer presentation of the media data
and further characterized as a distinct media data type, the
network comprising a plurality of client computer processing
nodes interconnected via packet-based data distribution
channels, the media data processor comprising:
a remote media data controller for receiving from a client
processing node a request for presentation of specified
presentation unit sequences;

a remote media data input manager for receiving from the
remote media data controller an indication of the specified
presentation unit sequences, and in response to the
request retrieving media data from a corresponding
media access location, determining the media data type
of each presentation unit in the retrieved media data,
designating each retrieved presentation unit to a specified
media data presentation unit sequence based on the
media data type determination for that presentation
unit, assembling a sequence of presentation descriptors
for each of the specified presentation unit sequences,
each descriptor comprising media data for one designated
presentation unit in that sequence, all presentation
descriptors in an assembled sequence being of a
common media data type, associating each presentation
descriptor with a corresponding presentation duration
and presentation time, based on the retrieved media
data, and linking the descriptors in each assembled
sequence to establish a progression of presentation units
in each of the specified presentation unit sequences;

a remote network media data manager connected to the
remote media data input manager for accepting from
the remote media data input manager the assembled specified
presentation descriptor sequences one presentation
descriptor at a time, assembling transmission presentation
unit packets each composed of at least a portion
of a presentation descriptor and its media data, all
presentation descriptors and media data in an
assembled packet being of a common media data type,
and releasing the assembled packets for transmission
via the network to the client processing node requesting
presentation of the specified presentation unit
sequences;

a local media data controller for transmitting the request
for presentation of specified presentation unit
sequences from the client processing node to the
remote media data controller via the network and
controlling starting and stopping of sequence presentation
in response to user specifications;

a local network media data manager for receiving from
the local media data controller an indication of the
specified presentation unit sequences, receiving the
transmission presentation unit packets transmitted from
the remote network media data manager via the
network, designating a presentation unit sequence for
each presentation descriptor and its media data in the
received packets to thereby assemble the presentation
descriptor sequences each corresponding to one specified
presentation unit sequence, all presentation
descriptors and media data in an assembled sequence
being of a common media data type, and linking the
descriptors in each assembled sequence to establish a
progression of presentation units for each of the presentation
unit sequences; and

a local media data interpreter, connected to the local
network media data manager, for accepting the
assembled presentation descriptor sequences one
descriptor at a time and releasing the sequences for
presentation one presentation unit at a time, indicating
a start time of presentation processing of the sequences,
maintaining a current presentation time as the
sequences are processed for presentation, and based on
the presentation duration of each presentation unit
synchronizing presentation of the specified presentation
unit sequences with the current presentation time.

45. The media data processor of claim 44 wherein the
specified media data presentation unit sequences comprise a
video frame sequence including a plurality of intracoded
video frames.

46. The media data processor of claim 45 wherein each
frame of the video frame sequence comprises an intracoded
video frame.

47. The media data processor of claim 46 wherein the
video frame sequence comprises a motion JPEG video
sequence.

48. The media data processor of claim 45 wherein each of
the plurality of intracoded video frames comprises a key
frame and is followed by a plurality of corresponding
non-key frames, each key frame including media data information
required for presentation of the following corresponding
non-key frames.

49. The media data processor of claim 45 wherein the
specified presentation unit sequences comprise a motion
JPEG video sequence and an audio sequence.

50. The media data processor of claim 44 wherein the
media access location comprises a computer storage location.

51. The media data processor of claim 50 wherein the
storage location comprises a computer [illegible].

52. The media data processor of claim 44 wherein the
local media data interpreter synchronizes presentation of the
specified presentation unit sequences by comparing for each
of the presentation descriptors in each of the presentation
descriptor sequences the presentation time corresponding to
that descriptor with the currently maintained presentation
time, and based on the comparison, releasing a next sequential
presentation unit to be processed for presentation when
the corresponding presentation time of that descriptor
matches the current presentation time, and deleting a next
sequential presentation unit to be processed for presentation
when the current presentation time exceeds the corresponding
presentation time of that descriptor.

53. The media data processor of claim 44 wherein the
local media data interpreter synchronizes presentation of the
specified presentation unit sequences by counting each presentation
unit in the sequences after that presentation unit is
released to be processed for presentation, to maintain a
distinct current presentation unit count for each sequence,
comparing for each of the presentation unit sequences a
product of the presentation unit duration and the current
presentation unit count of that sequence with the currently
maintained presentation time after a presentation unit from
that sequence is released to be processed for presentation,
and based on the comparison, releasing a next sequential
presentation unit in that presentation unit sequence when the
product matches the currently maintained presentation time,
and deleting a next sequential presentation unit in that
presentation unit sequence when the product exceeds the
currently maintained presentation time.

54. The media data processor of claim 52 wherein the
local media data interpreter determines for each presentation
descriptor sequence availability of a next sequential presentation
descriptor in that sequence when the currently maintained
presentation time indicates that a presentation unit
should be released for presentation, and based on that
availability, generates and releases a virtual presentation unit
of the corresponding presentation duration to be presented as
a default presentation unit each time a next presentation
descriptor in that sequence is not available and until a next
presentation descriptor is available.

55. The media data processor of claim 53 wherein the
local media data interpreter determines for each presentation 
descriptor sequence availability of a next sequential presen- 
tation descriptor in that sequence when based on said 
comparison a presentation unit should be released for 
presentation, and based on that availability, generates and 
releases a virtual presentation unit of the corresponding 
presentation duration to be presented as a default presenta- 
tion unit each time a next presentation descriptor in that 
sequence is not available and until a next presentation 
descriptor is available. 

56. The media data processor of either of claims 54 or 55 
wherein the local media data interpreter is adapted to 
monitor and identify a previously unavailable presentation 
descriptor when that descriptor is later available, and in 
response to identification of the later available descriptor, 
withholding the later available presentation unit from release 
for presentation, whereby the later available unit is not 
presented. 

57. The media data processor of either of claims 54 or 55 
wherein the specified presentation unit sequences comprises 
a motion video sequence of video frames and an audio 
sequence of audio samples, and wherein each virtual video 
presentation unit comprises a most recently presented video 
frame and each virtual audio presentation unit comprises 
silent audio samples. 
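
Illustrative sketch (not part of the claims): for an audio sequence, the virtual unit of claim 57 is made of silent audio samples. Assuming silence is represented by zero-valued PCM samples, a unit of a given duration could be built as follows; the function name and both parameters are hypothetical.

```python
def silent_audio_unit(duration_ms: int, sample_rate_hz: int) -> list:
    """Build a virtual audio unit made of zero-valued (silent) samples."""
    # Number of samples that fill the requested presentation duration.
    sample_count = (duration_ms * sample_rate_hz) // 1000
    return [0] * sample_count
```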

58. The media data processor of either of claims 54 or 55 
wherein the specified presentation unit sequences comprise 
an audio sequence and a video frame sequence composed of 
a plurality of key video frames, each key frame followed by 
a plurality of corresponding non-key frames, each key frame 
including media data information required for presentation 
of the following corresponding non-key frames, and wherein 
the local media data interpreter is adapted to monitor and 
identify a previously unavailable presentation descriptor 
corresponding to a key frame when that descriptor is later 
available, and in response to identification of the later 
available key frame descriptor, withholding the descriptor 
and any following descriptors, corresponding to non-key 
frames following the key frame, from release for 
presentation, whereby the later available key frame and 
following non-key frames are not presented. 
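
Illustrative sketch (not part of the claims): claim 58 withholds a late key frame together with the non-key frames that depend on it. Assuming each descriptor is represented as a pair of position and key-frame flag, the set of withheld positions could be computed as below; the function name and the tuple layout are assumptions.

```python
def withheld_after_late_key(descriptors: list, late_key_index: int) -> list:
    """Return positions withheld when the key frame at late_key_index is late.

    descriptors is a list of (index, is_key) pairs in presentation order.
    """
    withheld = []
    dropping = False
    for index, is_key in descriptors:
        if not dropping:
            if index == late_key_index and is_key:
                dropping = True          # the late key frame itself is withheld
                withheld.append(index)
        elif is_key:
            break                        # the next key frame starts a fresh group
        else:
            withheld.append(index)       # dependent non-key frame
    return withheld
```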

59. The media data processor of claim 50 wherein the 
media data retrieved by the remote media data input man- 
ager comprises a plurality of storage presentation unit 
sequences, each storage presentation unit sequence com- 
posed of presentation units for a specified presentation unit 
sequence and all presentation units in a storage presentation 
unit sequence being of a common media data type, and 
wherein the start time of presentation processing indicated 
by the local media data interpreter is common to all of the 
specified presentation descriptor sequences, whereby the
presentation unit sequences are synchronously processed for 
simultaneous synchronous presentation. 

60. The media data processor of claim 50 wherein the 
network comprises a local area network. 

61. The media data processor of claim 50 wherein the 
network comprises a wide area network. 

62. The media data processor of claim 60 wherein the 
remote media data controller advertises to client computer 
processing nodes, via the network, an indication of specified 
presentation unit sequences that may be requested from that 
remote media data controller. 

63. The media data processor of claim 44 wherein the 
media access location comprises a digitized representation 
of analog media data captured in real time. 

64. The media data processor of claim 44 wherein the
media access location comprises a PBX server.

65. The media data processor of claim 44 wherein presentation
of the specified presentation unit sequences comprises
display of the presentation unit sequences.

66. The media data processor of claim 44 wherein presentation
of the specified presentation unit sequences comprises
VCR tape printing of the presentation unit sequences.

67. The media data processor of claim 65 wherein display 
of the presentation unit sequences comprises display on a 
computer monitor. 

68. The media data processor of claim 65 wherein display
of the presentation unit sequences comprises display on a 
television monitor. 

69. The media data processor of claim 44 wherein presentation
of the specified presentation unit sequences comprises
recording the sequences at a computer storage location.

70. The media data processor of claim 44 wherein pre- 
sentation of the specified presentation unit sequences com- 
prises sending the sequences to a PBX server. 

71. The media data processor of claim 44 wherein the
media access location comprises an access point to a public 
switch network. 

72. The media data processor of claim 44 wherein presentation
of the specified presentation unit sequences comprises
sending the sequences to an access point in a public
switch network.

73. The media data processor of claim 44 wherein the
remote media data controller further receives from the local
media data controller via the network an indication of a
specified presentation data rate at which the specified presentation
unit sequences are to be transmitted via the network
to the client node, and in response, the remote media
data controller indicates the specified presentation data rate
to the remote media data input manager and the remote
media data network manager,

further wherein the media data retrieved by the remote 
media data input manager comprises a plurality of 
storage presentation unit sequences stored in a com- 
puter storage location, each storage presentation unit 
sequence composed of presentation units corresponding
to a specified presentation unit sequence and all
presentation units in a storage presentation unit 
sequence being of a common media data type; and 
further wherein the remote media data input manager 
designates each of a portion of the presentation unit
descriptors as the descriptor sequences are assembled, 
the portion including a number of descriptors based on 
the specified presentation data rate, each designated 
descriptor comprising null media data, to thereby com- 
50 pose the presentation descriptor sequences with only a 
portion of storage presentation unit media data, 
whereby the specified presentation unit sequences 
attain the specified presentation data rate of transmis- 
sion. 

74. The media data processor of claim 44 wherein the
remote media data controller further receives from the local
media data controller via the network an indication of a
specified presentation data rate at which the specified
presentation unit sequences are to be transmitted via the
network to the client node, and in response, the remote media
data controller indicates the specified presentation data rate
to the remote media data input manager and the remote
media data network manager,

further wherein the media data retrieved by the remote
media data input manager comprises a storage presentation
unit sequence stored in a computer storage
location, presentation units of the storage presentation
unit sequence comprising alternately interleaved
presentation units from the specified presentation unit
sequences; and
further wherein the remote network media data manager
designates each of a portion of the presentation descriptors
as the transmission presentation unit packets are
assembled, the portion including a number of descriptors
based on the specified presentation data rate, each
designated descriptor comprising null media data, to
thereby compose the transmission presentation unit
packets with only a portion of specified sequence
presentation unit media data, whereby the transmission
presentation unit packets attain the specified presentation
data rate of transmission.

75. The media data processor of either of claims 73 or 74 
wherein the specified presentation unit sequences comprise
a motion video frame sequence including a plurality of 
intracoded video frames and an audio sequence. 

76. The media data processor of claim 73 wherein the 
specified presentation unit sequences include an audio 
sequence composed of audio presentation units having
corresponding audio storage presentation units; and

wherein the portion of presentation units having a pre- 
sentation unit sequence designation includes all audio 
storage presentation units. 

77. The media data processor of claim 74 wherein the
specified presentation unit sequences include an audio 
sequence composed of audio presentation units; and 

wherein the portion of presentation units having a trans- 
mission presentation unit packet designation includes 
all audio presentation units.

78. The media data processor of either of claims 73 or 74
wherein the local media data controller receives from the
client node a client user-specified indication of a specified
presentation data rate at which the specified presentation
unit sequences are to be transmitted to the client node.

79. The media data processor of either of claims 73 or 74 
wherein the local network media data manager monitors 
availability of presentation descriptors as they are accepted 
by the local media data interpreter one descriptor at a time 
from the local network media data manager, and based on
the availability, indicates the specified presentation data rate
to the local media data controller for indication to the remote
media data controller.

80. The media data processor of claim 79 wherein the
local network media data manager indicates a specified
presentation data rate that is higher than a current presentation
unit sequence transmission rate when the monitored
availability increases to a prespecified upper availability.

81. The media data processor of claim 79 wherein the 
local network media data manager indicates a specified
presentation data rate that is lower than a current presentation
unit sequence transmission rate when the monitored
availability decreases to a prespecified lower availability. 
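
The availability-based rate indication of claims 79 to 81 can be illustrated with a minimal sketch. The class name RateAdvisor, the fractional step size, and the threshold values are hypothetical; the claims specify only that a higher rate is indicated at the upper availability and a lower rate at the lower availability.

```python
from dataclasses import dataclass


@dataclass
class RateAdvisor:
    """Suggests a presentation data rate from descriptor availability.

    All names and numeric values are illustrative assumptions.
    """
    lower_availability: int   # prespecified lower availability (descriptors)
    upper_availability: int   # prespecified upper availability (descriptors)
    step: float = 0.25        # fractional change per adjustment (assumed)

    def advise(self, available: int, current_rate: float) -> float:
        """Return the rate to indicate to the local media data controller."""
        if available >= self.upper_availability:
            # Claim 80: availability rose to the upper mark, indicate a higher rate.
            return current_rate * (1.0 + self.step)
        if available <= self.lower_availability:
            # Claim 81: availability fell to the lower mark, indicate a lower rate.
            return current_rate * (1.0 - self.step)
        # Between the two marks the current rate is left unchanged.
        return current_rate


advisor = RateAdvisor(lower_availability=4, upper_availability=16)
print(advisor.advise(available=20, current_rate=30.0))  # 37.5
print(advisor.advise(available=2, current_rate=30.0))   # 22.5
print(advisor.advise(available=8, current_rate=30.0))   # 30.0
```

Keeping a gap between the two thresholds gives hysteresis, so small fluctuations in availability do not cause the indicated rate to oscillate.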

82. A method for controlling transmission of digitized
media data in a packet switching network, the media data
comprising a sequence of continuous time-based presentation
units, each unit characterized by a prespecified presentation
duration and presentation time during a computer
presentation of the media data and further characterized as
a distinct media data type, the network comprising a plurality
of client computer processing nodes interconnected
via packet-based data distribution channels, the method
comprising:

receiving from a client processing node a request for
presentation of specified presentation unit sequences;

in response to the request retrieving media data from a
corresponding media access location;

determining the media data type of each presentation unit
in the retrieved media data;

designating each retrieved presentation unit to a specified
media data presentation unit sequence based on the
media data type determination for that presentation
unit;

assembling a sequence of presentation descriptors for 
each of the specified presentation unit sequences, each 
descriptor comprising media data for one designated 
presentation unit in that sequence, all presentation 
descriptors in an assembled sequence being of a com- 
mon media data type; 

associating each presentation descriptor with a corre- 
sponding presentation duration and presentation time, 
based on the retrieved media data; 

linking the descriptors in each assembled sequence to 
establish a progression of presentation units in each of 
the specified presentation unit sequences; 

assembling transmission presentation unit packets each 
composed of at least a portion of a presentation descrip- 
tor and its media data, all presentation descriptors and 
media data in an assembled packet being of a common 
media data type; and 

releasing the assembled packets for transmission via the 
network to the client processing node requesting pre- 
sentation of the specified presentation unit sequences. 
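
The assembling, linking, and packetizing steps of claim 82 can be sketched as follows. This is only an illustration under stated assumptions: the Descriptor and Packet types, the tuple layout of the retrieved units, and the max_payload parameter are hypothetical and do not appear in the claims.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Descriptor:
    media_type: str                       # e.g. "video" or "audio"
    start: float                          # presentation time, seconds
    duration: float                       # presentation duration, seconds
    data: bytes                           # media data for one presentation unit
    next: Optional["Descriptor"] = None   # link to the following unit


@dataclass
class Packet:
    media_type: str
    start: float
    duration: float
    payload: bytes


def assemble_sequences(
    units: Iterable[Tuple[str, float, float, bytes]]
) -> Dict[str, List[Descriptor]]:
    """Designate each unit to a sequence by media type and link the sequence."""
    sequences: Dict[str, List[Descriptor]] = defaultdict(list)
    for media_type, start, duration, data in units:
        d = Descriptor(media_type, start, duration, data)
        seq = sequences[media_type]
        if seq:
            seq[-1].next = d              # establish the progression of units
        seq.append(d)
    return dict(sequences)


def packetize(seq: List[Descriptor], max_payload: int) -> List[Packet]:
    """Split each descriptor into packets that all share one media type."""
    packets = []
    for d in seq:
        # max(..., 1) yields one empty packet for a descriptor with no data.
        for i in range(0, max(len(d.data), 1), max_payload):
            packets.append(Packet(d.media_type, d.start, d.duration,
                                  d.data[i:i + max_payload]))
    return packets


units = [("video", 0.0, 1 / 30, b"V" * 5),
         ("audio", 0.0, 0.02, b"A" * 2),
         ("video", 1 / 30, 1 / 30, b"V" * 3)]
sequences = assemble_sequences(units)
video_packets = packetize(sequences["video"], max_payload=4)
print([len(p.payload) for p in video_packets])  # [4, 1, 3]
```

Because each packet carries the media type, presentation time, and duration of its descriptor, a receiver can rebuild the per-type sequences without inspecting the payload.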

83. The method of claim 82 further comprising:

receiving at the client processing node the transmission
presentation unit packets via the network;

designating a presentation unit sequence for each presentation
descriptor and its media data in the received
packets to thereby assemble the presentation descriptor
sequences each corresponding to one specified presentation
unit sequence, all presentation descriptors in an
assembled sequence being of a common media data
type;

linking the descriptors in each assembled sequence to 
establish a progression of presentation units for each of 
the presentation unit sequences; 

indicating a start time of presentation processing of the 
sequences; 

maintaining a current presentation time as the descriptor
sequences are processed for presentation; and 

based on the presentation duration of each presentation 
unit, synchronizing presentation of the specified pre- 
sentation unit sequences with the current presentation 
time. 

84. The method of claim 82 wherein the specified pre- 
sentation unit sequences comprise an intracoded video frame 
sequence and an audio sequence. 

85. The method of claim 83 wherein the step of synchro- 
nizing presentation of the specified presentation unit 
sequences comprises: 

comparing for each of the presentation descriptors in each 
of the presentation descriptor sequences the presenta- 
tion time corresponding to that descriptor with the 
currently maintained presentation time; and 

based on the comparison, releasing a next sequential 
presentation unit to be processed for presentation when 
the corresponding presentation time of that descriptor 
matches the current presentation time, and deleting a 
next sequential presentation unit to be processed for 
presentation when the current presentation time 
exceeds the corresponding presentation time of that 
descriptor. 
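
The comparison recited in claim 85 can be illustrated with a short sketch for a single sequence. The function name step, the tuple layout, and the tolerance used to decide that two times "match" are assumptions made for the example; the claim does not define a tolerance.

```python
from collections import deque
from typing import Deque, List, Tuple

Unit = Tuple[float, str]   # (presentation time in seconds, label)


def step(queue: Deque[Unit], now: float,
         tolerance: float = 0.005) -> Tuple[List[str], List[str]]:
    """Process one sequence at the current presentation time `now`.

    Returns the labels of released units and of deleted units.
    """
    released: List[str] = []
    deleted: List[str] = []
    while queue:
        due, label = queue[0]
        if now - due > tolerance:
            # Current time exceeds the unit's presentation time: delete it.
            deleted.append(label)
            queue.popleft()
        elif abs(now - due) <= tolerance:
            # Presentation time matches the current time: release it.
            released.append(label)
            queue.popleft()
        else:
            break              # the next unit is not due yet
    return released, deleted


frames = deque([(0.0, "f0"), (0.1, "f1"), (0.2, "f2"), (0.3, "f3")])
print(step(frames, now=0.0))   # (['f0'], [])
print(step(frames, now=0.2))   # (['f2'], ['f1'])
```

In the second call the presentation clock has skipped past f1, so f1 is deleted rather than presented late and f2 is released on time.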
86. The method of claim 83 wherein the step of synchronizing
presentation of the specified presentation unit
sequences comprises:

counting each presentation descriptor in the sequences
after that presentation unit is released to be processed
for presentation, to maintain a distinct current presentation
unit count for each sequence;

comparing for each of the presentation unit sequences a
product of the presentation unit duration and the current
presentation descriptor count of that sequence with the
currently maintained presentation time after a presentation
unit from that sequence is released to be processed
for presentation; and

based on the comparison, releasing a next sequential
presentation unit in that presentation unit sequence
when the product matches the currently maintained
presentation time, and deleting a next sequential presentation
unit in that presentation unit sequence when
the product exceeds the currently maintained presentation
time.

87. The method of claim 83 further comprising:

receiving via the network an indication of a specified
presentation data rate at which the specified presentation
unit sequences are to be transmitted via the network
to the client node, further wherein the media data
retrieved comprises a plurality of storage presentation
unit sequences stored in a computer storage location,
each storage presentation unit sequence composed of
presentation units corresponding to a specified presentation
unit sequence and all presentation units in a
storage presentation unit sequence being of a common
media data type; and

designating each of a portion of the presentation unit
descriptors as the descriptor sequences are assembled,
the portion including a number of descriptors based on
the specified presentation data rate, each designated
descriptor comprising null media data, to thereby compose
the presentation descriptor sequences with only a
portion of storage presentation unit media data,
whereby the specified presentation unit sequences
attain the specified presentation data rate of transmission.
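
One way to picture the null-media-data designation of claim 87 is the sketch below, which keeps every descriptor's presentation time but empties the payload of a rate-dependent share of them. The function name thin_to_rate, the even spacing of retained units, and the use of None for null media data are assumptions; the claim states only that the number of designated descriptors is based on the specified rate.

```python
from typing import List, Optional, Tuple

# (presentation time in seconds, media data or None for null media data)
Descriptor = Tuple[float, Optional[bytes]]


def thin_to_rate(seq: List[Descriptor], native_rate: float,
                 target_rate: float) -> List[Descriptor]:
    """Null out payloads so about target_rate/native_rate of units carry data."""
    if target_rate >= native_rate:
        return list(seq)                      # nothing needs to be dropped
    if target_rate <= 0:
        return [(t, None) for t, _ in seq]    # every unit carries null data
    keep_ratio = target_rate / native_rate
    credit = 1.0 - keep_ratio                 # start so the first unit is kept
    out: List[Descriptor] = []
    for t, data in seq:
        credit += keep_ratio
        if credit >= 1.0:
            credit -= 1.0
            out.append((t, data))             # unit keeps its media data
        else:
            out.append((t, None))             # descriptor kept, data nulled
    return out


video = [(i / 30, b"frame") for i in range(6)]
thinned = thin_to_rate(video, native_rate=30.0, target_rate=15.0)
print([data is not None for _, data in thinned])
# [True, False, True, False, True, False]
```

Because the descriptors themselves remain in the sequence, the timeline seen by the receiver is unchanged; only the amount of media data transmitted is reduced.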

88. The method of claim 83 further comprising:

receiving via the network an indication of a specified
presentation data rate at which the specified presentation
unit sequences are to be transmitted via the network
to the client node, further wherein the media data
retrieved comprises a storage presentation unit
sequence stored in a computer storage location, presentation
units of the storage presentation unit sequence
comprising alternately interleaved presentation units
from the specified presentation unit sequences; and

designating each of a portion of the presentation descriptors
as the presentation descriptor sequences are
assembled, the portion including a number of descriptors
based on the specified presentation data rate, each
designated descriptor comprising null media data, to
thereby compose the transmission presentation unit
packets with only a portion of specified sequence
presentation unit media data, whereby the transmission
presentation unit packets attain the specified presentation
data rate of transmission.

89. The method of either of claims 87 or 88 further 
comprising: 

monitoring availability of presentation descriptors after
the descriptors are received at the client node and
before the descriptors are presented; and

based on the availability, indicating the specified presentation
data rate via the network.

90. A computer-based media data processor for capturing 
and controlling transmission of digitized media data in a 
packet switching network, the media data comprising a 
sequence of continuous time-based presentation units, each 
unit characterized by a prespecified presentation duration 
and presentation time during a computer presentation of the 
media data and further characterized as a distinct media data 
type, the network comprising a plurality of client computer 
processing nodes interconnected via packet-based data dis- 
tribution channels, the media data processor comprising: 

a local media data controller for indicating user-specified 
presentation unit sequences to be captured from a client 
node for recording at a network media access location; 
a local media data interpreter for receiving the specified 
presentation unit sequences from the client node, 
assembling a sequence of presentation descriptors for 
each of the received specified presentation unit 
sequences, each descriptor comprising media data for 
one presentation unit in that sequence, all presentation 
descriptors in an assembled sequence being of a com- 
mon media data type, associating each presentation 
descriptor with a corresponding presentation duration 
and presentation time, based on the retrieved media 
data, and linking the descriptors in each assembled 
sequence to establish a progression of presentation 
units for each of the presentation unit sequences; 
a local network media data manager connected to the 
local media data interpreter, for accepting from the 
local media data interpreter the assembled specified 
presentation descriptor sequences one presentation 
descriptor at a time, assembling transmission presentation
unit packets each composed of at least a portion
of a presentation descriptor and its media data, all
presentation descriptors and media data in an 
assembled packet being of a common media data type, 
and releasing the assembled packets for transmission 
via the network to the network media access location; 
a remote media data controller for receiving from the 
local media data controller an indication of the speci- 
fied presentation unit sequences to be recorded at the 
network media access location; 
a remote network media data manager for receiving from 
the remote media data controller an indication of the 
specified presentation unit sequences, receiving the 
transmission presentation unit packets transmitted from 
the local network media data manager via the network, 
designating a presentation unit sequence for each pre- 
sentation descriptor and its media data in the received 
packets to thereby assemble the presentation descriptor 
sequences each corresponding to one specified presen- 
tation unit sequence, all presentation descriptors and 
media data in an assembled sequence being of a com- 
mon media data type, and linking the descriptors in 
each sequence to establish a progression of presentation 
units for each of the presentation unit sequences; and 
a remote media data output manager for receiving from 
the remote media data controller an indication of the 
specified presentation unit sequences, and connected to 
the remote network media data manager, for accepting 
the assembled presentation descriptor sequences one 
descriptor at a time, formatting the accepted sequences 
and media data in a media access format, and releasing 
the formatted sequences to the media access location. 

91. The media processor of claim 90 wherein the media
access location comprises a computer storage location.

92. The media processor of claim 91 wherein the computer
storage location comprises computer file.

93. The media processor of claim 90 wherein the specified 
presentation unit sequences comprise an intracoded video 
frame sequence and an audio sequence.

94. The media processor of claim 93 wherein the media 
access location comprises a computer file. 

95. The media processor of claim 94 wherein the media 
access format comprises a storage presentation unit 
sequence, presentation units of the storage presentation unit
sequence comprising alternately interleaved presentation 
units from the specified presentation unit sequences. 

96. The media processor of claim 94 wherein the media 
access format comprises a plurality of storage presentation 
unit sequences, each storage presentation unit sequence 
composed of presentation units for a specified presentation
unit sequence and all presentation units in a storage presen- 
tation unit sequence being of a common media data type. 

97. The media processor of claim 93 wherein the media 
access location comprises a VCR tape printer. 

98. A computer-based method for capturing and controlling
transmission of digitized media data in a packet switching
network, the media data comprising a sequence of
continuous time-based presentation units, each unit characterized
by a prespecified presentation duration and presentation
time during a computer presentation of the media data
and further characterized as a distinct media data type, the
network comprising a plurality of client computer processing
nodes interconnected via packet-based data distribution
channels, the method comprising:

indicating user-specified presentation unit sequences to be
captured from a client node for recording at a network 
media access location; 

receiving the specified presentation unit sequences from 
the client node; 

assembling a sequence of presentation descriptors for
each of the received specified presentation unit
sequences, each descriptor comprising media data for
one presentation unit in that sequence, all presentation
descriptors in an assembled sequence being of a common
media data type;

associating each presentation descriptor with a corre- 
sponding presentation duration and presentation time, 
based on the retrieved media data; 

linking the descriptors in each assembled sequence to
establish a progression of presentation units for each of 
the presentation unit sequences; 

assembling transmission presentation unit packets each 
composed of at least a portion of a presentation descriptor
and its media data, all presentation descriptors and
media data in an assembled packet being of a common 
media data type; and 

releasing the assembled packets for transmission via the 
network to the network media access location. 

99. The method of claim 98 further comprising:

receiving the transmission presentation unit packets transmitted
via the network;

designating a presentation unit sequence for each presentation
descriptor and media data in the received packets
to thereby assemble the presentation descriptor
sequences each corresponding to one specified presentation
unit sequence, all presentation descriptors in an
assembled sequence being of a common media data
type;

linking the descriptors in each sequence to establish a
progression of presentation units for each of the presentation
unit sequences;

formatting the accepted sequences and media data in a
media access format; and

releasing the formatted sequences to the media access
location.

100. The method of claim 99 wherein the media access
location comprises a computer storage location. 

101. The method of claim 100 wherein the computer 
storage location comprises computer file. 

102. The method of claim 100 wherein the specified 
presentation unit sequences comprise an intracoded video 
frame sequence and an audio sequence. 

103. A computer-based media data processor for control- 
ling the computer presentation of digitized continuous time- 
based media data composed of a sequence of presentation 
units, each unit characterized by a prespecified presentation 
duration and presentation time during a computer presenta- 
tion of the media data and further characterized as a distinct 
media data type, the media data processor comprising: 

a media data input manager for retrieving media data from 
a corresponding media data access location in response 
to a request for computer presentation of specified 
presentation unit sequences, determining the media 
data type of each presentation unit in the retrieved 
media data, designating each retrieved presentation unit 
to a specified media data presentation unit sequence 
based on the media data type determination for that 
presentation unit, assembling a sequence of presenta- 
tion descriptors for each of the specified presentation 
unit sequences, each presentation descriptor compris- 
ing media data for one designated presentation unit in 
that sequence, all presentation descriptors in an 
assembled sequence being of a common media data 
type, and linking the presentation descriptors in each 
assembled sequence to establish a progression of pre- 
sentation units in each of the sequences; and 

a media data interpreter, connected to the media data input 
manager, for accepting from the media data input 
manager the assembled presentation descriptor 
sequences one descriptor at a time and releasing the 
sequences for presentation one presentation unit at a 
time, indicating a start time of presentation processing 
of the presentation unit sequences, maintaining a cur- 
rent presentation time as the sequences are processed 
for presentation, counting each unit in the sequences 
after that unit is released to be processed for 
presentation, to maintain a distinct current presentation 
unit count for each sequence, comparing for each of the 
presentation unit sequences a product of the presentation
unit duration and the current presentation unit
count of that sequence with the currently maintained
presentation time after each unit from that sequence is
processed for presentation, and based on the 
comparison, releasing for presentation processing a 
next sequential presentation unit in that sequence when 
the product matches the currently maintained presen- 
tation time count, and deleting a next sequential pre- 
sentation unit in that sequence when the product 
exceeds the currently maintained presentation time 
count.

104. A computer-based media data processor for control- 
ling transmission of digitized media data in a packet switch- 
ing network, the media data comprising a sequence of 
continuous time-based presentation units, each unit charac- 
terized by a prespecified presentation duration and presen- 
tation time during a computer presentation of the media data 
and further characterized as a distinct media data type, the 
network comprising a plurality of client computer processing
nodes interconnected via packet-based data distribution
channels, the media data processor comprising: 
a remote media data controller for receiving from a client 
processing node a request for presentation of specified 
presentation unit sequences; 
a remote media data input manager for receiving from the 
media data controller an indication of the specified 
presentation unit sequences, and in response to the 
request, retrieving media data from a corresponding 
media access location, determining the media data type 
of each presentation unit in the retrieved media data, 
designating each retrieved presentation unit to a speci- 
fied media data presentation unit sequence based on the 
media data type determination for that presentation 
unit, assembling a sequence of presentation descriptors 
for each of the specified presentation unit sequences,
each descriptor comprising media data for one desig- 
nated presentation unit in that sequence, all presenta- 
tion descriptors in an assembled sequence being of a 
common media data type, and linking the descriptors in 
each assembled sequence to establish a progression of 
presentation units in each of the specified presentation 
unit sequences; 
a remote network media data manager connected to the 
remote media data input manager, for accepting from 
the remote media data manager the assembled specified 
presentation descriptor sequences one presentation 
descriptor at a time, assembling transmission presen- 
tation unit packets each composed of at least a portion 
of a presentation descriptor and its media data, all 
presentation descriptors and media data in an 
assembled packet being of a common media data type, 
and releasing the assembled packets for transmission 
via the network to the client processing node requesting 
presentation of the specified presentation unit 
sequences; 

a local media data controller for transmitting the request
for presentation of specified presentation unit
sequences from the client processing node to the
remote media data controller via the network and
controlling starting and stopping of sequence presentation
in response to user specifications;

a local network media data manager for receiving from
the local media data controller an indication of the
specified presentation unit sequences, receiving the
transmission presentation unit packets transmitted from
the remote network media data manager via the
network, designating a presentation unit sequence for
each presentation descriptor and media data in the
received packets to thereby assemble the presentation
descriptor sequences each corresponding to one specified
presentation unit sequence, all presentation
descriptors and media data in an assembled sequence
being of a common media data type, and linking the
descriptors in each assembled sequence to establish a
progression of presentation units for each of the presentation
unit sequences; and

a local media data interpreter, connected to the local
network media data manager, for accepting the
assembled presentation descriptor sequences one
descriptor at a time and releasing the sequences for
presentation one unit at a time, indicating a start time
of presentation processing of the sequences, maintaining
a current presentation time as the descriptor
sequences are processed for presentation, and based on
the presentation duration of each presentation unit,
synchronizing presentation of the specified presentation
unit sequences with the current presentation time.

105. A computer-based media data processor for capturing
and controlling transmission of digitized media data in
a packet switching network, the media data comprising a 
sequence of continuous time-based presentation units, each 
unit characterized by a prespecified presentation duration 
and presentation time during a computer presentation of the 
media data and further characterized as a distinct media data 
type, the network comprising a plurality of client computer 
processing nodes interconnected via packet-based data dis- 
tribution channels, the media data processor comprising: 
a local media data controller for indicating user-specified 
presentation unit sequences to be captured from a client 
node for recording at a network media access location; 
a local media data interpreter for receiving the specified 
presentation unit sequences from the client node, 
assembling a sequence of presentation descriptors for 
each of the received specified presentation unit 
sequences, each descriptor comprising media data for 
one presentation unit in that sequence, all presentation 
descriptors in an assembled sequence being of a com- 
mon media data type, and linking the descriptors in 
each assembled sequence to establish a progression of 
presentation units for each of the presentation unit 
sequences; 

a local network media data manager connected to the 
local media data interpreter, for accepting from the 
local media data interpreter the assembled specified 
presentation descriptor sequences one presentation 
descriptor at a time, assembling transmission presen- 
tation unit packets each composed of at least a portion 
of a presentation descriptor and its media data, all 
presentation descriptors and media data in an 
assembled packet being of a common media data type, 
and releasing the assembled packets for transmission 
via the network to the network media access location; 
a remote media data controller for receiving from the
local media data controller an indication of the specified
presentation unit sequences to be recorded at the network
media access location;
a remote network media data manager for receiving from 
the remote media data controller an indication of the 
specified presentation unit sequences, receiving the 
transmission presentation unit packets transmitted from 
the local network media data manager via the network, 
designating a presentation unit sequence for each pre- 
sentation descriptor and media data in the received 
packets to thereby assemble the presentation descriptor 
sequences each corresponding to one specified presen- 
tation unit sequence, all presentation descriptors in an 
assembled sequence being of a common media data 
type, and linking the descriptors in each sequence to 
establish a progression of presentation units for each of 
the presentation unit sequences; and 
a remote media data output manager for receiving from 
the remote media data controller an indication of the 
specified presentation unit sequences, and connected to 
the remote network media data manager, for accepting 
the assembled presentation descriptor sequences one 
descriptor at a time, formatting the accepted sequences 
and media data in a media access format and releasing 
the formatted sequences to the media access location. 

*****

IN THE DRAWINGS: Sheet 5, FIG. 6, reference number 36, the phrase "create stream manager"
should read --create stream interpreter--; and the figure labeled "STREAM INTERPRETER" should
have a reference number --28--. Sheet 6, FIG. 7, the "INTERLEAVED DISK BUFFERS" should be
framed in a broken-line box labeled reference number --100--. IN THE SPECIFICATION: Column
1, lines 61 and 64, each occurrence of the word "dock" should read --clock--. Column 2, lines 53-54,
the phrase "when the product exceeds the current presentation time count" should read --when the
current presentation time count exceeds the product--. Column 4, lines 25-26, the phrase "when the
product exceeds the currently maintained presentation time" should read --when the currently
maintained presentation time exceeds the product--. Column 10, line 66, the phrase "video frame
108" should read --video frame 110--; and the phrase "audio frame 110" should read --audio frame
108--. Column 12, line 41, the word "sell" should read --self--. Column 14, line 53, the word
units and unit duration exceeds the currently maintained time count" should read --when the currently
maintained time count exceeds the product of processed units and unit duration--; line 34, the word
"then" should read --when--; and line 48, the word "steam" should read --stream--. Column 19, line
54, the number "82" should be --84--. Column 22, line 3, the number "176" should be --174--; and
line 40, the word "some" should read --same--. Column 25, line 42, the word "the" should be deleted.

(57) ABSTRACT

A DVD authoring system in a processor-based system 
removes an author from consideration of the DVD Specifi- 
cation during authoring. According to a preferred 
embodiment, the authoring system provides an authoring
engine having an interactive graphical interface, a
data management engine, an emulator, a compiler, a multiplexer
and a simulator. Using summary authoring data, the
compiler builds a skeleton-form PGC layout structure com- 
prising control PGC abstractions and router PGC abstrac- 
tions. The compiler then resolves the PGC abstractions 
according to source-target connections. During playback on 
a DVD player, the PGC abstractions form elements in a 
connection-switching abstraction superstructure. 
Accordingly, in response to DVD-consumer and other control
events, a source PGC preferably determines target PGC
information and then transfers control, via virtual connec- 
tions through necessary router PGC abstractions, to a target 
PGC abstraction. The target PGC abstraction then corre- 
spondingly initiates playback of a movie chapter or displays 
a menu. 
MENU AUTHORING SYSTEM AND
METHOD FOR AUTOMATICALLY 
PERFORMING LOW-LEVEL DVD 
CONFIGURATION FUNCTIONS AND 
THEREBY EASE AN AUTHOR'S JOB 

FIELD OF THE INVENTION 

The present invention relates generally to mass data 
storage and retrieval, and more particularly to apparatus and 
methods for authoring a digital versatile disk. 

BACKGROUND OF THE INVENTION 

New mass data storage means provide not only for storing 
greater amounts of multimedia and other information, but 
also for more interactive data retrieval by consumers. For 
example, one such storage means is espoused by the "DVD 
Specification for Read-Only Disc, Physical, File Format and
Video Specifications" (DVD Consortium 1997), hereinafter
referred to as the "DVD Specification". Other examples
include further DVD-related technologies (e.g. DVD-Audio,
DVD-RAM, etc.) as well as non-DVD technologies.

The Physical and File System portions of the DVD
Specification define the physical encoding and organization
of data for storage on read-only digital versatile disk ("DVD-ROM")
media. The Video portion of the DVD Specification
defines a data set ("DVD-Video data set") with which
pre-recorded DVD-Video discs must conform in order to
assure proper reading, decoding and playback when inserted
into a media reader/decoder ("DVD-player"). More
specifically, the Video portion specifies how "control data"
and audio/video "presentation data" are encoded and
ordered within the data set. The control data determines how
presentation of audio/video data will proceed when the disc
is played back on a DVD-player and consists of low-level
state information, data structures and instruction sets which
govern what kinds of functions and user operations a DVD-player
can perform.

The DVD Specification is further hereby fully incorpo- 
rated herein by reference as if repeated verbatim immedi- 
ately hereinafter. 

The process of encoding and authoring a DVD movie 
title, as currently practiced, includes a number of separate 
and distinct steps requiring similarly separate and distinct 
expertise. After movie production, raw film and/or video 
footage is edited, the soundtrack is edited and mixed, and a 
movie film or video master is created. This master is 
subsequently digitized, encoded as video and audio streams 
and stored as data files. In accordance with the DVD 
Specification, the Moving Picture Experts Group ("MPEG-1
or MPEG-2") format is used to encode the video streams and
any one or more of a number of specified formats (e.g.
MPEG-1 or MPEG-2 Audio, Dolby AC-3, PCM) is used to
encode the audio streams. Graphic data (i.e. still or moving 
images for creating menus and other presentation data) is 
also created and stored in conventional graphic files. Finally, 
authoring guidelines, the encoded audio and video stream 
files and the graphic files are gathered for the authoring 
phase. 

During authoring, a DVD author utilizes the guidelines 
and file information to construct a DVD movie-title. The 
authored movie-title determines what a user of a resultant 
movie title will see and hear, and what kinds of interactions 
the user can command when the movie title is played back 
by a DVD-player. The author organizes the video, audio and
(often author-created) subtitle files, divides the movie into 
segments ("chapters"), creates menus, and specifies low-level
instructions. The low-level instructions will set
parameters, define fixed or optional jump points and their
destinations and determine the order and options by which
playback of still pictures, movie chapters and associated
audio tracks will proceed based on the user's menu selections
and/or use of other DVD-player controls (i.e. typically
using a remote control device).

Once authored, the author's organizational decisions,
subtitle, chapter and menu decisions, and low-level instructions
are compiled into control data, and the encoded video,
audio and subtitle streams, as well as the graphic data files,
are multiplexed into presentation data, which together constitute
the DVD-Video data set. Finally, this DVD-Video
data is converted into a "disc image layout" file, which can
be used to burn a "write-once DVD-R" disc, or can be stored
onto a tape to send to a DVD-ROM manufacturing plant for
creating a "master" disc, which can then be used for replication.

Conventional DVD authoring systems comprise a computer
system running an application-specific DVD authoring
program. An exemplary, widely used conventional DVD
authoring system is Scenarist-II.

Scenarist-II is essentially an attempted, nearly direct
embodiment of the DVD Specification. Using Scenarist-II,
an author organizes data streams, and constructs menus and
DVD structures according to the DVD Specification. Top
level structures (i.e. up to 99 "VTSs" and "VTSMs", a
"VMG" and a "VMGM") are constructed by selecting the
structure type and then populating the structure with one or
more low-level command segments ("program chains" or
"PGCs") including movie or menu references. Throughout
this process, the author also selects from among available
data formats, as well as from among the numerous DVD
options and requisite parameters, using a number of provided
lists and other data and parameter representations.
Stated alternatively, all structures and PGC parameters,
capabilities and references must be fully specified by the
author on an ongoing basis during authoring.

Unfortunately, the DVD Specification is very complex, as
are the conventional programs that attempt to embody it.
Available options are extensive, as are the numerous listings
of options and parameters within programs such as
Scenarist-II. The potential combinations of structures and
PGCs are also extensive, and many such combinations will
not ultimately result in functional DVD movie-titles.

To make matters more difficult, the PGCs (i.e. basic and
frequent constructs of the DVD Specification and therefore
of programs such as Scenarist-II) are counter-intuitive.
Often, many PGCs (including both operative and so-called
"dummy" PGCs) must be used in specific combinations to
provide a DVD consumer with even the most basic control
capabilities. Limitations imposed by the DVD Specification
must also be considered throughout the process. Thus, errors
in planning and/or programming might well remain undetected
until after a substantial number of structures are
formed. In addition, given the sheer number of structures,
PGCs, commands, options and parameters involved,
identifying, locating and correcting errors is difficult and
time-consuming.

Consequently, while providing extensive low-level control
and an expedient authoring-to-compilation
correspondence, conventional authoring systems require an
extensive expertise with regard to both the DVD Specification
and the authoring system itself. Further, even assuming
such expertise, authoring is extremely time-consuming and
is therefore typically very costly. In addition, even assuming
resolution of other factors, the time and expertise required
would likely prevent authoring of even a preliminary movie-title
as a directorial aid during the movie production process.

A further disadvantage of conventional authoring systems
is that experimentation and all but necessary modification
are often compromised due to time and cost considerations.
Thus, many DVD movie titles (due to limited budget to
support expensive authoring time) provide a DVD consumer
with only minimal playback control, navigation flexibility
and interactivity.

Accordingly, there is a need for an authoring system and
method that enables DVD authoring in a manner removed
from the structures and low-level instruction sets of the
DVD Specification, thereby reducing the time, cost and
complexity of the authoring process.

There is further a need for such an apparatus and method
whereby authoring can be conducted in an intuitive manner,
while maximizing flexibility and access to features provided
by or otherwise not in conflict with the DVD Specification.

SUMMARY OF THE INVENTION

The present invention provides a data processing-system
based authoring system and method that essentially removes
an author from consideration of the structures and low-level
instruction sets of the DVD Specification. More specifically,
the present authoring system removes the ordered tasks
associated with creating DVD structures and programming
PGCs, and replaces them instead with an interactive, intuitive
and graphical authoring environment.

The present invention further provides for flexible program
flow in response to control events. Many interactive
controls, menu button destinations and other features that
are possible in accordance with the DVD Specification can
be specified by an author in multiple instances and according
to quick, intuitive and interactively modifiable selections.
Thus the invention facilitates authoring of a DVD movie
title by even an inexperienced author with context sensitive
responsiveness to DVD consumer instructions and other
DVD player-generated events.

Accordingly, a preferred embodiment of the present
invention comprises an authoring engine having an integrated
interface with which an author performs the above
tasks, a data management engine for storing and recalling
authoring information, a simulator for viewing progressive
and/or comparatively authored movie titles prior to
compiling, a compiler, a multiplexer and an emulator for
viewing authored movie titles after compiling and multiplexing.

Included within and facilitating the ability of these elements
to remove an author from the DVD Specification are
several abstractions. Preferably, the interface provides such
"user abstractions" as arranging movies (i.e. data streams
including video, audio, subtitles, chapter points and other
elements), creating menu layouts (i.e. menus, menu buttons
and still or moving images with or without sound) and
specifying connections among these arrangements and
layouts, each in a simple and intuitive, yet highly flexible
way. Further abstractions include a network or connection-switching
abstraction and a number of control and router
PGC abstractions from which the connection-switching
abstraction is constructed.

Authoring instructions entered through the interface are
preferably broken down into component parts and stored by
the data management engine. The invoked compiler, using
only summary authoring information, preferably constructs
a skeleton-form PGC layout structure comprised of PGC
abstractions corresponding to the number of authored movie
elements. The compiler then completes the layout structure
according to author-selected and default source-target connections.

Further according to a preferred embodiment, during
playback of a resultant DVD movie title, a source PGC
abstraction is invoked in response to DVD player and/or
consumer instructions. The source PGC abstraction determines
target information and transfers control, through
necessary router PGC abstractions, to a target PGC abstraction.
The target, in accordance with the target information,
plays a movie chapter, displays a menu, or sets and/or
modifies one or more DVD parameters.

These and other objects, advantages and benefits of the
present invention will become apparent from the drawings
and specification that follow.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a functional block diagram generally illustrating
an authoring system according to a preferred embodiment of
the invention;

FIG. 2 is a functional block diagram illustrating in more
detail a preferred authoring program of the authoring system
shown in FIG. 1, according to the invention;

FIG. 3 is a screenshot of a preferred performance element
arrangement interface portion of the FIG. 2 authoring
program, according to the invention;

FIG. 4 is a blowup of the FIG. 3 screenshot showing, in
more detail, a preferred authoring toolbar for accessing
authoring program modules and functions;

FIG. 5 is a flowchart illustrating an exemplary method
used by an author to create a performance element arrangement
using the performance element arrangement interface
portion of FIG. 3;

FIG. 6a is a flowchart illustrating preferred responses of
the authoring program to authoring while the performance
element arrangement interface portion of FIG. 3 is active;

FIG. 6b is a flowchart further illustrating preferred
responses of the authoring engine to authoring while the
performance element arrangement interface portion of FIG.
3 is active;

FIG. 7 is a screenshot of a menu element layout interface
portion of the FIG. 2 authoring program, according to the
invention;

FIG. 8 is a flowchart illustrating an exemplary method
used by an author to create a menu layout using the menu
element layout interface portion of FIG. 7;

FIG. 9 is a screenshot of a preferred connections interface
portion of the FIG. 2 authoring program, according to the
invention;

FIG. 10 is a screenshot of a preferred simulator interface
portion of the FIG. 2 authoring program, according to the
invention;

FIG. 11 is a functional block diagram of a preferred data
management engine according to the invention;

FIG. 12a is a flowchart showing generally the operation
of a preferred compiler according to the invention;

FIG. 12b is a flowchart showing how a compiler according
to the invention preferably constructs a skeleton-form
PGC layout structure;

FIG. 12c is a flowchart showing how the compiler preferably
resolves source-target connections and substitutes
those connections for null operations in a preferred skeleton-form
PGC layout structure, according to the invention;
FIG. 13 is a block diagram showing the format of a
preferred PGC layout structure according to the invention;

FIG. 14 is a functional block diagram showing a preferred
connection-switching abstraction according to the invention;

FIG. 15 is a flowchart showing a preferred operation of
the connection-switching abstraction of FIG. 14, according
to the invention.

DETAILED DESCRIPTION OF A PREFERRED
EMBODIMENT

For clarity sake, the discussed embodiment herein will be
directed primarily toward storage according to the DVD
Specification, and more specifically at authoring motion
picture DVD ROMS ("movie titles"). It should be
understood, however, that the present invention relates to a
broad range of program and data storage and retrieval
utilizing a variety of media, only a subset of which will be
specifically identified herein. The types of DVD ROMS
which can be authored are further in no way limited to movie
titles. Other examples include but are not limited to music
videos, documentaries, educational videos, corporate
training, medical applications and other continuous play or
interactive information which utilizes audio, video and/or
other presentation data.

As illustrated in FIG. 1, a preferred embodiment of
authoring system 100 according to the invention preferably
comprises electrically connected hardware elements including
input devices 110, processor 115, memory 120, storage
125, MPEG encoder/decoder 130, video I/O device 135 and
audio I/O device 140. Authoring system 100 further comprises
software elements including operating system 150,
authoring engine 160, data management engine 165, compiler
170, simulator 175, emulator 180 and multiplexer 185.

It will be apparent to those skilled in the art that several
variations of the authoring system elements are contemplated
and within the intended scope of the present invention.
For example, given processor and computer performance
variations and ongoing technological advancements,
hardware elements such as MPEG encoder/decoder 130 may
be embodied in software or in a combination of hardware
and software. Similarly, software elements such as multiplexer
185 may be embodied in hardware or in a combination
of hardware and software. Further, while connection to
other computing devices is indicated as network I/O 145,
wired, wireless, modem and/or other connection or connections
to other computing devices (including but not limited
to local area networks, wide area networks and the internet)
might be utilized. A further example is that the use of
distributed processing, multiple site viewing, information
forwarding, collaboration, remote information retrieval and
merging, and related capabilities are each contemplated.
Various operating systems and data processing systems can
also be utilized, however at least a conventional multitasking
operating system such as Windows95® or Windows
NT® (trademarks of Microsoft, Inc.) running on an IBM®
(trademark of International Business Machines) compatible
computer is preferred and will be presumed for the discussion
herein. Input devices 110 can comprise any number of
devices and/or device types for inputting commands and/or
data, including but not limited to a keyboard, mouse, and/or
speech recognition. (The use of a keyboard and a mouse are
exemplified throughout the discussion that follows.)

The FIG. 2 block diagram illustrates in greater functional
detail an authoring program 201 of the preferred authoring
system of FIG. 1. As shown, authoring program 201 comprises
authoring engine 160 (which includes interface 160a),
data management engine 165, compiler 170, simulator 175,
emulator 180, multiplexer 185, output DVD data storage
290 and layout formatter 187, user abstractions 285 and
PGC abstractions 287.

It is discovered through examination of the features
supported by DVD players that the basic presentation data
types and consumer controls available to an author of DVD
movie titles can be generalized and then reconstructed as
abstracted user data types and controls. Further, despite the
complexity of the DVD Specification, many of its programming
constructs can also be generalized and then reconstructed
as abstracted DVD program chains ("PGCs") operating
within a further abstracted network or connection-switching
superstructure. Such user abstractions 285 and
PGC abstractions 287, as integrated into authoring engine
160, data management engine 165 and compiler 170 (as
illustrated), effectively remove an author using authoring
program 201 from consideration of DVD Specification 205.
These abstractions further remove such consideration without
unduly limiting, for most practical purposes, authoring
flexibility, PGC efficiency or interactive responsiveness of a
resultant DVD-ROM, among other factors. In addition,
these abstractions provide a framework of re-useable components
that are readily adaptable to further modification for
providing improvements, and for re-use in a variety of other
DVD and non-DVD applications.

Authoring program 201 is preferably implemented in
C++, an object-oriented language, for reliability, updateability
and other known generalized advantages of object-oriented
programming. Those skilled in the computer arts
will appreciate however, that despite such advantages, other
environments and/or programming languages of various
object-oriented and non-object-oriented types can also be
utilized.

Operationally, an author enters authoring information and
instructions for activating and controlling authoring program
201 through interface 160a of authoring engine
160. Authoring engine 160 interactively receives entered
information and commands by correspondingly adjusting
interface portion 160a, invoking a further authoring program
module, sending entered authoring information to data
management engine 165, retrieving authored information from
data management engine 165, and sending and/or retrieving
presentation data from presentation data storage 203. Data
management engine 165 responds to authoring engine 160
by receiving and storing authored information from authoring
engine 160 and/or sending information, which it
retrieves from storage (and/or from a remote source), to
authoring engine 160. Simulator 175 responds to authoring
engine 160 by retrieving authoring data from data management
engine 165, retrieving multiplexed presentation data
from multiplexer, and simulating an authored DVD-ROM in
conjunction with interface 160a.

Compiler 170 responds to authoring engine 160 by
retrieving authored information from data management
engine 165, compiling the information and storing the
compiled information (".ifo files") in output DVD data
storage 290. Emulator 180 responds to authoring engine 160
by retrieving compiled data from output DVD data storage
290, retrieving multiplexed data from output DVD data
storage 290 and emulating an authored DVD-ROM in
conjunction with interface 160a. Multiplexer 185 responds
to authoring engine 160 by receiving DVD parameter information
from compiler 170, retrieving presentation data from
presentation data storage 203 and combining the retrieved
information and data in accordance with DVD Specification 205.
Multiplexer 185 then stores the combined information and
data ("DVD data stream" or ".vob file") in output DVD data
storage 290. Layout formatter 187 retrieves the .vob files
and .ifo files from output DVD data storage 290 and combines
these files into a single "disc image" file, which it then
stores in disc image file storage 207. The disc image file can
then be sent through network I/O 145 (FIG. 1) to additional
apparatus for further review, processing and/or for burning
one or more DVD-ROMs 207.

FIGS. 3 through 10, with reference to FIG. 2, illustrate
how an interface according to the invention enables an
author to assemble a movie title essentially removed from
DVD programming specifications 207 (FIG. 2) of the DVD
Specification. Preferred interface 160a is illustrated as an
application running under a Windows95® or Windows NT®
(trademark of Microsoft, Corp.) operating system.

The FIG. 3 screenshot illustrates a preferred authoring
window 300, which an author can utilize to select an
arrangement of audio-visual material including video segments
("video clips"), audio segments ("audio clips") and
subtitles (hereinafter referred to collectively as "performance
data").

Authoring window 300 is divided into movable, modifiable
and replaceable groupings or views and panels
including presentation data panel 301, performance assembly
panel 302, assembled elements panel 307, log panel 308
and preview video panel 309. Assembly panel 302 is further
divided into video assembly portion 320, audio assembly
portion 330, and subtitle assembly portion 340 (which are
collectively referred to herein as performance view 303),
and performance tools portion 360. Authoring window 300
also includes authoring toolbar 399a and menu bar 399b. For
clarity sake, the following discussion assumes that a single,
continuous movie is being authored (i.e. a movie having
component video, audio and subtitle data streams, each of
which begins at the start of the movie and ends at the
conclusion of the movie).

Presentation data panel 301 provides a display listing for
each presentation data file that an author has selected and
loaded for use in assembling movies and menus either
during a current authoring session or when continuing a
re-initiated, prior authoring session. File listings include file
name 311, file duration 313, and file type 315 parameters.
File name 311 lists the name of a file. File duration 313 lists
the playback duration of files such as video data files and
audio data files. File type 315 alternatively lists a file format,
which is generally indicated by a filename extension, or a
recognized data type such as "video" data or "audio" data.
As will be further discussed, presentation data file listings
can be used interactively during an authoring session.

Performance assembly view 303 of performance assembly
panel 302 is used by an author to graphically and
interactively assemble loaded video and/or audio data, to
add and assemble subtitles, and/or to add chapter points. For
these purposes, performance view 303 includes video
assembly portion 320, audio assembly portion 330, subtitle
assembly portion 340 and chapter assembly portion 350
respectively. Video assembly portion 320 is used by an
author to assemble graphic objects referencing stored video
data files ("video clips"). As discussed, these files, once
initially selected, are listed in presentation data panel 301.
Video frame thumbnails 323a and 323b are indicative of
chapter points as will be further discussed herein.

Audio assembly portion 330 of performance assembly
panel 302 is used by an author to receive graphic objects
referencing stored audio data files ("audio clips"). As with
video clips, audio clips, once selected for use, are listed in
and selected from presentation data panel 301 for arrangement
purposes. Up to eight (alternate language) audio data
streams or audio "tracks", exemplified by audio tracks 331a
through 331c, are available in accordance with DVD Specification
205 (FIG. 2). Audio bars 332a and 332b, which
represent author-arranged audio clips, have a length that
reflects the playback time of the audio data represented.
Separators 333 are further indicators of chapter points as
with video frame thumbnails 323a and 323b of video
assembly portion 320. Audio tracks 332a through 332c
further include audio encoding indicators 334a, audio format
indicators 334b, track number indicators 335 and
selected language indicators 336, which are indicative
respectively of audio data file encoding and playback
format, selectable audio track number 336 and modifiable
language label 335. Language labels 335 can be set by
author selection or, as is expected, automatically by recognition
of languages spoken in a recorded dialog of a respective
audio track.

Subtitle assembly portion 340 provides for entry, retrieval
and arrangement of up to thirty-two (alternate language)
frame-based subtitle sequences, as exemplified by tracks 341a and
341b. Exemplary subtitle frames 342a and 342b illustrate
textual subtitles. Subtitles are entered in a conventional
manner using a conventional text editor (not shown)
which is invoked by activating a subtitle frame (e.g. by menu
selection or double-clicking) and/or by retrieving a pre-existing
subtitle file using, for example, presentation data
panel 301. As with audio assembly portion 330, subtitle
portion 340 includes selectable track numbers and modifiable
language label indicators.

Performance assembly view 303 also includes chapter
assembly portion 350, which is used by an author to graphically
and interactively assemble chapter points. Chapter
assembly portion 350 includes wall clock 351, reference
offset clock 352, author-assembled chapter indicators 353a
through 353c, chapter time indicators 354a through 354c
and reference time indicators 355a through 355c. Wall clock
351 indicates a time within a video clip corresponding to a
cursor position over chapter portion 350 of assembly panel
302. Offset clock 352 indicates the start time of a currently
indicated video clip according to [illegible] of
a master tape (i.e. from which the video data file was
created). Chapter indicators 353a through 353c show chapter
points (i.e. points to which a DVD-ROM consumer can
advance) as arranged during authoring. Chapter time indicators
354a through 354c and reference time indicators 355a
through 355c display the elapsed time of corresponding
selected chapter points [illegible] and from
the start of a clip respectively. Reference times are typically
recorded (and thus can be selectively retrieved and
displayed) utilizing Society of Motion Picture and Television
Engineers ("SMPTE") timecode.

As noted earlier, performance assembly panel 302 and the
other panels and views of authoring window 300 are
replaceable. Tabs 302a provide one alternative control structure
for selectively switching between initiated or "open"
authoring tasks, for example, to alternate between assembling
presentation data of multiple movies, for creating
menu layouts, and/or for other authoring tasks. Other control
structures include menu options (not shown) for selectively
de-coupling panels and transport enabling controls (362a
through 362c and 363a through 363b), and further for
re-coupling in the illustrated default arrangement, in an
author-selectable arrangement and/or interactively by an
author. Panels can be resized and/or re-arranged among
other window capabilities, as will be understood by those
skilled in the art in view of the discussion herein.
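
By way of illustration only, the relationship between the chapter time indicators, the reference time indicators and the offset clock of chapter assembly portion 350 might be sketched as follows. The sketch assumes a non-drop-frame rate of 30 frames per second, assumes that one elapsed time is measured from the start of the movie and the other from the start of the clip, and assumes that the offset clock holds the master-tape timecode at the start of the clip. The names `ChapterPoint`, `frames_to_timecode` and `timecode_to_frames` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass

FPS = 30  # assumed non-drop-frame rate; the text does not specify one


def frames_to_timecode(frames: int, fps: int = FPS) -> str:
    """Format a frame count as HH:MM:SS:FF (non-drop-frame)."""
    if frames < 0:
        raise ValueError("frame count must be non-negative")
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def timecode_to_frames(timecode: str, fps: int = FPS) -> int:
    """Parse HH:MM:SS:FF (non-drop-frame) into a frame count."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


@dataclass(frozen=True)
class ChapterPoint:
    """Hypothetical model of one author-assembled chapter point."""
    clip_start_in_movie: int  # frames from movie start to the clip start
    offset_in_clip: int       # frames from the clip start to the chapter
    clip_master_start: str    # assumed master-tape timecode at clip start

    def time_in_movie(self) -> str:
        # Assumed meaning: elapsed time from the start of the movie.
        return frames_to_timecode(self.clip_start_in_movie + self.offset_in_clip)

    def time_in_clip(self) -> str:
        # Elapsed time from the start of the clip.
        return frames_to_timecode(self.offset_in_clip)

    def master_timecode(self) -> str:
        # Assumed meaning: master-tape timecode at the chapter point.
        start = timecode_to_frames(self.clip_master_start)
        return frames_to_timecode(start + self.offset_in_clip)
```

A chapter placed 45 frames into a clip that begins 900 frames into the movie would, under these assumptions, be reported at 00:00:31:15 in the movie and 00:00:01:15 in the clip.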
Assembly tools portion 360 of performance assembly
panel 302 comprises selectable zoom controls 361a through
361c, preview transport buttons including stop 362a, play
362b and frame advance 362c, preview transport start time
selector 363a and stop time selector 363b, selected clip
indicator 364a and total clips indicator 364b. Zoom controls
361a through 361c are used respectively for increasing the
viewable data range of a selected area within performance
assembly view 303 of performance assembly panel 302, for
selecting a portion of performance assembly view 303 for
such viewing, and for decreasing the viewable data range.
Transport controls 362a through 362c provide video playback
control when previewing a video clip, audio clip and/or
subtitle data using preview video panel 309, or when selecting
a representative video frame in a video clip as a preview
thumbnail (as with exemplary thumbnails 323a and 323b).
Transport control 362a halts video, audio and/or subtitle
playback, transport control 362b initiates/continues playback
and transport control 362c provides for per-frame
("step") viewing, as will be understood by those skilled in
the art. Start and end time selectors 363a and 363b are used
respectively for selecting and monitoring video, audio and/or
subtitle playback position and for setting and monitoring
a playback stop time.
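
The behavior of the preview transport controls described above can be pictured, purely as a sketch, with a small state holder. The class name `PreviewTransport`, the frame-based positions, the `tick` method and the choice to restart from the start selector once the stop time has been reached are all assumptions made for illustration; the specification does not describe an implementation.

```python
from enum import Enum


class TransportState(Enum):
    STOPPED = "stopped"
    PLAYING = "playing"


class PreviewTransport:
    """Hypothetical model of stop, play and frame advance controls."""

    def __init__(self, start_frame: int, stop_frame: int) -> None:
        if stop_frame < start_frame:
            raise ValueError("stop_frame must not precede start_frame")
        self.start_frame = start_frame  # value of the start time selector
        self.stop_frame = stop_frame    # value of the stop time selector
        self.position = start_frame
        self.state = TransportState.STOPPED

    def play(self) -> None:
        """Initiate or continue playback (assumed: restart if at the end)."""
        if self.position >= self.stop_frame:
            self.position = self.start_frame
        self.state = TransportState.PLAYING

    def stop(self) -> None:
        """Halt playback without changing the position."""
        self.state = TransportState.STOPPED

    def step(self) -> None:
        """Per-frame viewing: halt playback and advance one frame."""
        self.state = TransportState.STOPPED
        self.position = min(self.position + 1, self.stop_frame)

    def tick(self, frames: int = 1) -> None:
        """Advance while playing; halt on reaching the stop time."""
        if self.state is not TransportState.PLAYING:
            return
        self.position = min(self.position + frames, self.stop_frame)
        if self.position == self.stop_frame:
            self.state = TransportState.STOPPED
```

Keeping the position and the playing state separate lets frame advance work from a halted preview, which matches the per-frame viewing described for control 362c.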

Assembled elements panel 306 provides interactive and 
selectable listings of authored contents of a current movie 
title, including but not limited to movie volume 361, movies 
362 and menus 363. 

Log panel 308 provides selectable progress reports and 
other information relating to decoding/encoding of presen- 
tation data, compiling and layout of a disk file format 
according to DVD disk format specifications 205 (FIG. 2). 
These reports are automatically created and can be accessed 
using log tabs exemplified by tabs 381 and 383. 

Preview video panel 309 selectively displays a video
frame corresponding to a cursor position over assembly
panel chapter portion 350, video assembly portion 320,
audio assembly portion 330, subtitle portion 340 and/or
chapter portion 350 of assembly panel 302. In addition,
preview video panel 309 is used for previewing video data using
transport controls 362a through 362c, start and stop time
selectors 363a and 363b or directly invoking the panel using
selection or drag-and-drop capabilities. (As will be under- 
stood by those skilled in the art, encoded video and audio 
files are decoded and buffered, as needed, for playback in a 
conventional manner using MPEG encoder/decoder 130 of 
FIG. 1.) 

The following toolbar chart lists the respective elements 
of toolbar 399. It will be understood by those skilled in the
art, in view of the discussion herein, that the toolbar
elements can vary substantially and can include user-defined
expandable and replaceable elements. The elements shown
are provided as defaults.



Label     Referenced as         Description

401       New volume            Loads default values and adjusts the
                                interface for a new movie title.
403       New menu              Loads default values and adjusts the
                                interface for a new menu layout.
405       New movie             Loads default values and adjusts the
                                interface for authoring a new movie.
407       Connections           Switches to an existing connections
                                interface or adjusts the interface,
                                according to default values for
                                initially setting connections.
413-415   Cut, copy and paste   Provide conventional functions except
                                as described herein for connections.
421       Compile start         Initiating compiler operation.
423       Compiler stop         Interrupts compiler operation.
425       DVD Layout            Invokes DVD Disk layout operation.
427       Write Tape            Provides for output of multiplexed
                                data stream to tape.
429       Simulator             Invokes simulator



The FIG. 5 flowchart illustrates, by way of example and 
with reference to FIGS. 3 and 4, how an interface in 
accordance with the invention enables an author to assemble 
performance data and objects without consideration for 
structures, commands or ordered tasks imposed by DVD 
programming specifications 207 (FIG. 2). Select, open and 
drag-and-drop, among other operations, and clicking, 
double-clicking, click-and-drag and other user actions asso- 
ciated with graphic interfaces are well known and will not be 
further expounded upon herein. 

As shown, in step 505, an author initiates a new project
("volume") by selecting new volume 401 (FIG. 4). In step
510, the author initiates a new movie by selecting new
movie 405. In step 515, the author adds video and audio files
to presentation data panel 301 (FIG. 3) for potential use in
the volume by movies and menus. In step 520, the author can
preview a video file in preview panel 304 by dragging its
icon from presentation data panel 301 to preview panel 304
and/or, if desired, by invoking transport controls 362a
through 362c, preview timer 393 and/or other playback-related
controls. In step 525, the author adds a selected video
clip to the currently opened movie by double-clicking its
icon in presentation data panel 301 or by dragging the icon
from presentation data panel 301 to video assembly portion
320 of performance view 303. In step 530, the author can
select a video frame thumbnail for reference viewing by
dragging the pointer of thumbnail timer 325a and/or by using
transport controls 362a through 362c.

In step 535, the author can preview an audio file by
selecting its icon in presentation data panel 301 and using
controls including stop 362a, play 362b, using start time and
end time selectors 363a and 363b and/or using other
play-related controls. In step 540, the author adds a selected
audio clip to a next available track of the currently opened
movie by double-clicking its icon in presentation data panel
301. (Alternatively, the author can add a selected audio clip
to a specific audio track by dragging the icon from
presentation data panel 301 to a selected track in audio
assembly portion 330 of performance view 303.) In step 545,
the author selects a language label by selecting selected
language indicator 335 and selecting a listed element.

In step 550, the author opens a subtitle frame and enters 
subtitle information for display in a video frame during 
playback of video clips. In step 555, the author selects a 
language label corresponding to the subtitle track containing 
the subtitle frame. If, in step 560, the author elects to add 
more performance data, then the author returns to step 520. 

In step 565, the author moves a cursor within chapter
assembly portion 350 of performance view 303 to view
video frames available as chapter points. In step 570, the
author selects a chapter point. If, in step 575, the author
elects to add more chapter points, then the author continues
at step 565.

In step 580, the author selects an audio track number and
optionally selects a subtitle track number and/or playback
start and/or end times before selecting play button 362b to
preview playback of the video clip and the audio clip
referenced by the selected track number.

The FIGS. 6a and 6b flowchart (with reference to FIGS.
2 and 3) generally illustrates responses by the preferred
authoring program 201 to an author's actions according to
the invention. As shown, if in step 602 an author selects a
movie assembled in a prior authoring session, then, in step
604, data management engine 165 (FIG. 2) loads related
parameters and, in step 606, sends the parameters to
authoring engine 160. Otherwise, default parameters for a new
movie are loaded in step 608.

In step 609, authoring engine 160 updates assembled
elements panel 307 (FIG. 3) and other affected interface
160a elements to indicate the movie parameters. If, in step
612, the author selects presentation data files, then data
management engine 165 loads and sends the respective
presentation data file parameters to authoring engine 160 in
step 614, which updates presentation data panel 301 in step
616. If, in step 622, the author assembles one of the selected
video clips, then authoring engine 160 accordingly updates
video assembly portion 320, chapter assembly portion 350
and offset clock 352 in step 624, updates assembled
elements panel 307 in step 626, and sends the video clip
parameters to data management engine 165 for storage in
step 628. Similarly, if the author assembles one of the
selected audio clips in step 632, then authoring engine 160
updates the selected track of audio assembly portion 330 in
step 634, updates assembled elements panel 307 in step 636,
and sends the audio clip parameters to data management
engine 165 in step 638. If, in step 642, the author assembles
subtitle data, then authoring engine 160 updates subtitle
assembly portion 340 in step 644, updates assembled
elements panel 307 in step 646, and sends subtitle data and
parameters to data management engine 165 in step 628.

If, in step 652, the author moves an interface 160a pointer
(e.g. a mouse pointer) within chapter assembly portion 350,
then in step 654 authoring engine 160 updates wallclock
351, finds an I-frame (i.e. a video frame that is completely
described without reference to other frames) within the
video clip corresponding to the mouse pointer position and
displays the I-frame in preview video panel 309. If, in step
672, the author assembles a chapter point, then authoring
engine 160 updates video assembly portion 340 and chapter
assembly portion 350 in step 674, updates assembled
elements panel 307 in step 676, and sends corresponding
chapter parameters to data management engine 165 in step
678.

The FIG. 7 screenshot illustrates the preferred authoring
window 300 of FIG. 3 with the performance data assembly
panels replaced by panels for allowing an author to lay out
menus. More particularly, menu layout panel 701 and menu
tools panel 702 are selected, sized and positioned to replace
performance view 303 of FIG. 3. An exemplary menu layout
including graphic and textual images is shown in menu
layout panel 701 for purposes of illustration. Menu layout
panel 701 is used visually and interactively by an author to
retrieve, add, place and modify menu elements using menu
tools panel 702 selections.

In accordance with the DVD Specification, menu
elements presentable to a DVD consumer can include a
background image ("background"), an overlay image
("subpicture") and up to twenty-five buttons. For the present
example, author-selected background 710 is a multicolor
design, and author-selected subpicture 711 includes the
textual information, Dolby Demo 1, Dolby Demo 2, Play
Both Demos and Main Menu. Four author-created buttons
720a through 720d including button frames 721a through
721d are also shown. Each of button numbers 722a through
722d is added by authoring program 201 (FIG. 2) in
response to creation of a respective button for identification
purposes (i.e. during authoring and for use in compilation).

Menu tools panel 702 comprises controls for implementing
selectable menu element parameters and for selectably
altering the display characteristics of elements within menu
layout panel 701 during an authoring session. For example,
color selection boxes 732, 734, 736 and 738 allow an author
to choose a button outline color for display (in a consumer
viewing scenario) when a button is not selected ("normal"),
when a consumer points at the button ("selection") and when
a button is invoked ("action") respectively. An author can
also select the opacity of the buttons for these cases using
opacity sliders 733, 735, and 737 respectively. Similarly, an
author can select button shapes and other characteristics by
selecting one of the layout feature tabs 739 and utilizing the
tool sets that appear in a respective tool set panel (not
shown). An author might, for example, utilize prior button
shape, color, texture, opacity and/or normal, selection and
activation color combinations used with a prior authoring
session as either a starting point for further changes or
without further modification. Other parameter combinations
might also be utilized. Safe area toggle 755a allows an
author to selectively display safe area indicator 755b of
menu layout panel 701 (which bounds an area that is assured
to be displayed on a consumer television). Display controls
751 and 752 provide for altering the characteristics indicated
which, in light of the prior discussion, will be understood by
those skilled in the art without further edification.

Layout feature tabs 749 also provide access to button
ordering tools (not shown). As with other authoring
parameters, an author can selectively utilize an existing
order of buttons that will be traversed in a currently
displayed menu when a consumer pushes directional buttons on
a remote control device. An alternative order can also be set
using any number of methods including but not limited to
using a displayed remote control device or dragging an
arrow from a starting point to an ending point. Such features
and their operational characteristics, given the foregoing,
will be understood by those skilled in the art without further
edification.

The FIG. 8 flowchart shows how the actions required for
laying out a menu are consistent with those for assembling
performance data. Once again, authoring is visually and
interactively achieved without requiring any specific
ordering of actions. Therefore, as with performance data
assembly, the specific ordering of actions is given for
purposes of illustration only.

As shown, in step 805, the author selects background and
subpicture files for inclusion in a menu layout. Selected files
will appear in presentation data panel 301 (FIG. 7). In step
810, an author adds a background and a subpicture to the
current menu by double-clicking on file listings, dragging
the files to menu layout panel 701 or by using a similar
method. In step 815, the author draws (i.e. drags a box)
around subpicture text forming a button frame, thereby
indicating button placement directly in menu layout panel
701. If, in step 820, more button frames remain to be added,
then the author returns to step 815.

In step 825, the author selects a button and sets shape,
size, opacity and other parameters using preset combinations
and/or color selection boxes 732, 734, 736 and 738, opacity
sliders 733, 735, and 737 and/or other tools. In step 830, the
author sets the intra-menu button order in the manner
already described. If, in step 840, more menus remain to be
created, then the author selects add menu button 413 in step
840, and returns to step 805. New elements appear in
assembled elements panel 307 and control data (i.e. relating
to added elements and their layout characteristics) are sent
to data management engine 165 (FIG. 2) as with
performance data assembly.

The FIG. 9 screenshot illustrates a further selectable
configuration of the FIG. 3 interface for linking together
presentation data, menu layouts, buttons within menu
layouts and available control functions of a DVD player. As
shown, connection view 901 includes available targets panel
903 and linking panel 905. Linking panel 905 further
includes available sources portion 950 and connected targets
portion 960. While connections view 901 is active,
assembled elements panel 307 can further be used as a
selection means for navigating more quickly to a desired
target within available targets panel 903.

Operationally, an author forms a link or "available
connection" simply by copying (i.e. performing a copy action or
dragging) a target from available targets portion 903 to a
position in connected targets portion 960 that is in the same
row as a desired source in available sources view 950. As
with assembling a movie and menu layouts, an author can
interactively remove, move or otherwise modify links in a
conventional manner. For example, a link can be removed
by deletion or a target can be moved or copied to another
row in linking portion 905.

As with arranging performance data and forming menu
layouts, an author has easy and complete flexibility in
adding interactivity to a consumer's viewing experience. A
DVD movie can be authored, for example, such that entry
and exit from a menu can be controlled by any available
event. Referring also to the FIG. 10 simulator window 1000,
any menu button can further be linked to any DVD event,
including but not limited to a chapter point (e.g. chapter
point 953), the end of chapter playback or depressing a DVD
remote control device menu button 1020 and 1040 (FIG.
10). A particular menu button can also be used as a target in
multiple instances, as might be creatively appropriate.

Thus, for example, a consumer interface can be quickly
and easily created which is interactively responsive
("context sensitive") to a consumer's actions. Stated
alternatively, an interface can be authored such that, for
example, the conclusion of a specific chapter playback or
menu button activation will determine a next chapter
playback, a next menu or even a next menu wherein an
author-selected menu button is highlighted.

Among the reasons for such ease and flexibility is that,
contrary to conventionally authored DVD movies, program
chains are not created during the authoring process.
Similarly, connections specified during authoring are not
permanent ("hard wired"). Rather, program chains are not
created until compilation and available connections are not
fully resolved until playback, each according to additional
abstractions of the invention, as will be further discussed
herein.

The FIG. 11 block diagram illustrates the structure of a
preferred data management engine 165 (FIG. 1) according to
the invention. As illustrated, data management engine 165
only partially reflects the interface constructs and the
structures of the DVD Specification. While reflecting interface
abstractions (e.g. a movie, menu and connection based
movie-title description) and DVD Specification
requirements (e.g. first play jump source), data management engine
165 is further structured as a flexible network of data storage
and distribution objects that also reflects other abstractions
of the invention.

One further abstraction, for example, is a model of a DVD
player, a consumer's controller and the compiled authoring
instructions as an actively connection-switched network.
Within the network, DVD program chains representative of
action-oriented authoring instructions ("routers") perform
switching among available connections in response to DVD
[...] instructions, re-directing program flow and control.
Control-receiving chains [...] perform more localized [...]
(such as [...]). Stated alternatively, a router program chain
resolves [...] from a DVD [...] control [...]. Further
abstractions also include models or program chains for
performing a common base functionality in a same or similar
manner using a derived common program chain structure.

Such an arrangement provides real world flexibility and
efficiency. For example, data management engine 165
supports authoring flexibility with regard to source-target
connections that are switchable. Further, given the power of
even conventional computer systems, data management
engine 165 is sufficiently robust to enable the interactive
operation of interface 160a (FIG. 2) as well as minimal
compilation times of compiler 170 (i.e. only milliseconds)
without direct interface or DVD program specification 205
correlation. Data management engine 165 is therefore also
readily adaptable to interface variations and further
interfaces, as well as to compiler variations and other
compilers supporting other DVD and non-DVD data storage
and/or retrieval applications.

Referring again to FIG. 11 and with further reference to
FIG. 2, data management engine 165 comprises a root
volume object 1100 which manages data management
engine 165 communication and storage. Volume object 1100
provides an interface for communicating messaged data to
and from its component parts, including title key jump
source 1101, first play jump source 1102, media database
1103, DVD layout properties 1104, movies list 1105, menus
list 1106 and connections list 1107 (objects). Media database
1103 further includes media files list 1130, which stores
pointers to media files referred to by the performance data
arrangement as a result of authoring.

In addition, each of the presentation data objects (i.e.
movies list 1105 and menus list 1106) and a connection sets
list object 1107 contain links to other data management
engine objects in the form of an object tree. More
specifically, movies list 1105 is linked to movie objects
movie-1 1150a through movie-M 1150b, wherein M is the
total number of movies authored for storage on a single
DVD-ROM ("movie title"). Each movie object contains a
respective track list object 1151 and a respective chapter list
object 1152. Each track list object 1151 contains respective
track objects, track-1 1153a through track-T 1153b, wherein
T is the total number of tracks authored within a respective
movie. Track-1 through track-T further contain clip lists,
which in turn contain clip objects clip-1 1154a through
clip-CL 1154b (and wherein CL is the total number of clips
in a given track within a given movie). Finally, each clip
object contains a respective clip properties object, as
exemplified by clip object 1155.

Menu objects are structured in a manner similar to that of
movie objects. Menus list object 1160 contains menu objects
menu-1 1160a through menu-N, wherein N is the total
number of menus authored for storage on a given DVD-ROM.
Each menu object further contains a respective button
list object (e.g. object 1161), each button list object contains
respective button objects (button-1 1162a through
button-B 1162b) and each button object is linked to a button
properties object (e.g. object 1163). B indicates a total
number of buttons in a respective menu.

Finally, connections sets list 1107 contains respective
connections lists (i.e. connect-list-1 1170a through
connect-list-CL 1170b), wherein CL is the total number of
connections lists authored for storage on a given DVD-ROM. Each
connect-list is further linked to respective connections
objects (i.e. connect-1 1171a through connect-CN), wherein
CN is the total number of connections authored to facilitate
flexible program flow and control. Each connections object
(1171a through 1171b) represents an action-oriented switch
between a respective source and a respective target (as
indicated by source-pointer variable 1172 and target-pointer
variable 1173), as will be discussed further herein.
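
A connections object of this kind can be sketched as a small record holding a source pointer and a target pointer. The Python fragment below is illustrative only and is not the preferred implementation; the names `Connection`, `ConnectionList`, `connect` and `resolve`, the string identifiers, and the rule that re-linking a source replaces its earlier target are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Connection:
    """Hypothetical switch between one source and one target."""
    source_pointer: str  # stands in for source-pointer variable 1172
    target_pointer: str  # stands in for target-pointer variable 1173


@dataclass
class ConnectionList:
    """Hypothetical connect-list holding connection objects."""
    connections: List[Connection] = field(default_factory=list)

    def connect(self, source: str, target: str) -> Connection:
        # Assumption: a source has at most one target, so linking an
        # already-linked source simply switches its target.
        for conn in self.connections:
            if conn.source_pointer == source:
                conn.target_pointer = target
                return conn
        conn = Connection(source, target)
        self.connections.append(conn)
        return conn

    def resolve(self, source: str) -> Optional[str]:
        # Follow the switch from a source to its current target.
        for conn in self.connections:
            if conn.source_pointer == source:
                return conn.target_pointer
        return None


links = ConnectionList()
links.connect("menu-1/button-1", "movie-1/chapter-3")
links.connect("menu-1/button-1", "menu-2")  # switch to a new target
print(links.resolve("menu-1/button-1"))
```

Because the target is only a stored pointer, switching it is a single assignment, which is consistent with the description of connections as switchable rather than hard wired.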

Where applicable, each object includes an indexed object 
list having a pointer to each connected dependent object (i.e. 
an object "further down the tree" as illustrated), as well as 
a totals variable. The object list is updated to include new 
dependent objects as these objects are created 
("instantiated") to reflect, for example, an added chapter 
point or menu. Dependent objects are similarly removed 
from the object list according to authoring deletions. Totals 
variables are also updated during authoring to reflect each 
corresponding dependent object instantiation and deletion. 
Undo and redo operations are handled in a conventional 
manner using authoring instructions which are further con- 
ventionally stored within respective objects during each 
authoring session. 
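
The bookkeeping described above (an indexed list of dependent objects plus a totals variable kept in step with it) can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions, not the preferred implementation; the class name `TreeObject` and the method names `instantiate` and `delete` are hypothetical.

```python
class TreeObject:
    """Hypothetical node: an indexed object list plus a totals variable."""

    def __init__(self, name):
        self.name = name
        self.object_list = []  # indexed list of dependent objects
        self.total = 0         # totals variable, updated with the list

    def instantiate(self, name):
        """Create a dependent object and append it to the object list."""
        child = TreeObject(name)
        self.object_list.append(child)
        self.total += 1
        return child

    def delete(self, child):
        """Remove a dependent object after an authoring deletion."""
        self.object_list.remove(child)
        self.total -= 1


movie = TreeObject("movie-1")
chapters = movie.instantiate("chapter list")
first = chapters.instantiate("chapter-1")
second = chapters.instantiate("chapter-2")
chapters.delete(first)
print(chapters.total, [c.name for c in chapters.object_list])
```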

Using this structure, data management engine 165 breaks 
down or filters control data generated during authoring into 
its basic component parts for storage in a corresponding 
object's indexed data list. These basic component parts are 
then retrieved by authoring engine 160, or retrieved and 
reconstructed into an applicable form by compiler 170, as 
needed. 

Operationally, data management engine 165 receives 
messages from authoring engine 160 in response to and 
reflecting each author modification of a performance 
assembly, menu layout or connection. Volume 1100 receives 
the message, polls its contained-objects list for a recipient 
object according to the message type, and sends the message 
to the matching recipient object. If the message includes a 
reference to a title key source or a first play source (which 
is author-selectable in connections view 901), then volume
1100 sends the message respectively to either title key jump 
source 1101 or first play jump source 1102. Upon receipt, 
title key jump source 1101 or first play jump source 1102 
will accordingly store included data, delete stored data or 
modify stored data. 
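
The routing step just described (the volume polls its contained-objects list by message type and forwards the message to the matching recipient) might look like the sketch below. It is illustrative only; the dictionary-based message format, the `type` and `op` keys, and the class names `Volume` and `JumpSource` are assumptions rather than details of the preferred embodiment.

```python
class JumpSource:
    """Hypothetical recipient that stores, deletes or modifies data."""

    def __init__(self):
        self.data = None

    def handle(self, message):
        op = message["op"]
        if op in ("store", "modify"):
            self.data = message["data"]
        elif op == "delete":
            self.data = None
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.data


class Volume:
    """Hypothetical root object that forwards messages by type."""

    def __init__(self):
        # Contained-objects list, keyed here by message type.
        self.contained = {}

    def register(self, message_type, recipient):
        self.contained[message_type] = recipient

    def receive(self, message):
        recipient = self.contained.get(message["type"])
        if recipient is None:
            raise KeyError(f"no recipient for type: {message['type']}")
        return recipient.handle(message)


volume = Volume()
volume.register("title_key", JumpSource())
volume.register("first_play", JumpSource())
volume.receive({"type": "first_play", "op": "store", "data": "menu-1"})
print(volume.contained["first_play"].data)
```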

If a received message includes a reference to a video, 
audio or subtitle file, then volume 1100 sends the message 
to media database 1103. If the message contains an instruc- 
tion to add a data element, then media database 1103 stores 
the data (which will include a pointer to a media file) in 
media files list 1130. If the message contains an instruction 
to delete a stored pointer, then media database 1103 deletes 
the pointer. If the message contains an instruction to modify 
a stored pointer (e.g. if the file was moved to a new location), 
then media database 1103 locates and replaces the file 
pointer. Media database 1103 further updates its totals 
variable to reflect additions and deletions. 
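
The three media database operations above (add, delete and modify a stored file pointer) can be sketched as follows. This is a minimal illustration, assuming file paths stand in for the stored pointers; the class name `MediaDatabase`, the message keys and the example paths are hypothetical.

```python
class MediaDatabase:
    """Hypothetical media database holding a list of file pointers."""

    def __init__(self):
        self.media_files = []  # stands in for media files list 1130
        self.total = 0         # totals variable

    def handle(self, message):
        op = message["op"]
        path = message["path"]
        if op == "add":
            self.media_files.append(path)
            self.total += 1
        elif op == "delete":
            self.media_files.remove(path)
            self.total -= 1
        elif op == "modify":
            # Locate the stored pointer and replace it in place; the
            # total is unchanged because no element is added or removed.
            index = self.media_files.index(path)
            self.media_files[index] = message["new_path"]
        else:
            raise ValueError(f"unknown operation: {op}")


db = MediaDatabase()
db.handle({"op": "add", "path": "clips/intro.mpg"})
db.handle({"op": "add", "path": "clips/theme.ac3"})
db.handle({"op": "modify", "path": "clips/intro.mpg",
           "new_path": "archive/intro.mpg"})
db.handle({"op": "delete", "path": "clips/theme.ac3"})
print(db.total, db.media_files)
```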
If a received message type relates to the content of a
movie arrangement, menu layout or connection, then volume
1100 sends the message respectively to movies list
1105, menus list 1106 or connections list 1107. Each of
movies list 1105, menus list 1106 and connection sets list
1107 operates similarly to objects described thus far. Each
parses through a received message for included control
information, sends the message respectively to a
corresponding movie object, menu object or connections list and
adjusts its totals variable as needed.

A movie message, for example, will then progress down
through the movie object tree, and, depending upon the
message type, will be filtered by track list 1151, track-1
1153a and then handled by a matching clip, or will be filtered
by chapter list 1152 and then handled by a corresponding
chapter or by a clip properties object (i.e. as illustrated).
Menu layout data will similarly progress (as illustrated)
down through the menus list tree, being handled by a
matching menu properties object, and connections data will
progress down the connection sets list tree until it is handled
by a connection object (with reference to its source pointer
or destination pointer variables). Upon receipt, a clip
properties, menu key, end key, menu properties or
connection object will handle the message and store included data,
delete stored data or modify stored data in a similar manner
as with media database object 1103.

Each respective storage object stores authoring
modifications in a sequentially indexed list according to its type (i.e.
each object name is illustrated to reflect the data type the
object stores). Thus, for example, chapter points within a
movie are stored from a first chapter point during playback
to a final chapter point in the movie. (Playback will, however,
be determined by authored connections.) The list
accommodates added, inserted or deleted data interactively by
expanding or contracting about the addition, insertion or
deletion point.
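
A sequentially indexed list that expands or contracts about an insertion or deletion point can be sketched with an ordinary sorted list. The fragment below is illustrative only; representing chapter points as playback times in seconds and the helper names `add_chapter_point` and `delete_chapter_point` are assumptions.

```python
import bisect


def add_chapter_point(points, time_s):
    """Insert a chapter point, keeping the list in playback order."""
    bisect.insort(points, time_s)
    return points.index(time_s)  # index of the stored chapter point


def delete_chapter_point(points, time_s):
    """Delete a chapter point; later entries shift down by one index."""
    points.remove(time_s)


chapter_points = []
add_chapter_point(chapter_points, 120.0)
add_chapter_point(chapter_points, 30.0)
index = add_chapter_point(chapter_points, 75.5)  # lands between the two
delete_chapter_point(chapter_points, 30.0)
print(index, chapter_points)
```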

While other data structures might be utilized, interactively
adjusted indexed lists and limited object definitions, using
even a minimally equipped computer, are sufficiently robust
to accommodate an author's input rate, given the relatively
small amount of data stored in each list. Alternative
structures that might be used, for example, include but are not
limited to a lesser number of objects each containing a less
restricted dataset and/or the addition of summary objects for
storing total numbers of menus, buttons and other system
status and/or statistical information. Such arrangements,
however, have been found to add complexity with only
moderate gains in application-specific operational
characteristics. Alternative data structures, including but not
limited to multi-dimensional arrays, multiple queues and linked
lists stored locally and/or remotely, present similar tradeoffs.

Data management engine 165 returns stored data to
authoring engine 160 in a manner essentially the reverse of
that for storing data. Volume 1100, upon receipt of a request
for stored data, parses the request call for a data type,
searches its contained objects list for a corresponding object,
and forwards the request to title key jump source 1101, first
play jump source 1102, media database 1103, DVD layout
properties 1104, movies list 1105, menus list 1106 or
connection sets list 1107. Movies list 1105, menus list 1106
or connection sets list 1107, upon receipt of such a request,
parses its available objects list and forwards the message
correspondingly to a movie object, menu object or
connection list object, and so on, until the message is received by
a last recipient object. The last recipient object then retrieves
the requested data and sends the data in the reverse direction
of request receipt until the data reaches volume 1100. Volume
1100, upon receipt of the data, sends the requested data to
authoring engine 160. (Error handling and messaging
functionality are otherwise handled in a conventional manner.)

Data management engine 165 further responds to queries
from authoring engine 160 for purposes such as totaling the
number of data elements of a given type or for reviewing the
contents of a particular object's data list. As with data
storage and retrieval above, data management engine 165
receives a call from authoring engine 160 requesting
information. Volume 1100 parses the message, polls its available
objects list and sends the message to a corresponding object.
For objects linked to a tree-structure, such as movies list
1105, menus list 1106 and connection sets list 1107, the
message is forwarded down through respective objects as
already discussed, and a last recipient object will respond. If
the message requests, for example, a total number of data
elements of a given type then a last recipient will either poll
its totals variable or, if necessary, poll its data list for
corresponding data, count the number of corresponding
occurrences and return a response including the total. The
response is sent back through the tree structure to volume
1100, which sends the message (including the total) to
authoring engine 160. Given the relatively small number of
objects, alternatives (such as asynchronous
multiple-messaging and, in particular, broadcast messages) add some
expediency, but with unnecessarily added complexity.

As with the authoring engine interface objects, the object
types, inter-object messaging protocol and data objects
utilized in data management engine 165, in view of the
disclosure herein, will be apparent to those skilled in the
computer arts. Preferably available object libraries from
Microsoft® are utilized. For example, the preferred
available objects and data lists utilize Standard Template
Libraries and, in particular, Expandable Indexed Buffered/
Vectored Lists. Such objects provide robust response with
the flexibility of expandable lists and indexed vectors for
easy lookup in light of the typically small number of objects
and datasets, among other factors. As noted earlier however,
use of an object-oriented architecture and/or the specific data
structures are not essential and many conventional
alternatives can be utilized.

As discussed, the particular arrangement of objects of the
preferred data management engine 165 is preferred
according to its flexibility, performance and adaptability among
other factors. It should be noted therefore, that any number
of modifications will be apparent according to the teachings
and within the spirit and scope of the invention.

FIGS. 12a through 15, with reference to FIGS. 2 and 11,
illustrate compilation according to a preferred embodiment
of the invention.

As shown generally in FIG. 12a, compiler 170 (FIG. 2)
preferably operates on data entered through the authoring
process into the interface 160a of authoring engine 160
(FIG. 2) and stored by data management engine 165 in three
stages. In step 1201, compiler 170 builds an intermediate

according to the number of corresponding authored element
types (e.g. the number of menu buttons in a given menu).
In step 1203, compiler 170 resolves source-target
connections as indices to source and target identifier
information within data management engine 165. In step 1205,
compiler 170 replaces the indices with identifier information
which is retrieved by further querying data management
engine 165.

FIG. 13 illustrates a preferred PGC layout structure
according to the invention. As shown, the PGC layout
structure is divided into a single first play PGC space 1301
(in accordance with the DVD Specification), a single video
manager ("VMGM") domain 1302, and one or more video
title set ("VTS") domains (e.g. 1303 and 1304) according to
the number of movies in the movie title.

The preferred VMGM domain PGC layout structure
[...] key PGC abstraction 1321 and a [...] movie [...] PGC
abstraction 1322. Thereafter, the VMGM PGC structure
includes 2 menu PGC abstractions [...] for [...] PGC
abstraction for each end command (in each movie) [...] for
which an author has [...] a connection. [...] be discussed,
each menu PGC abstraction pair [...] a menu PGC (e.g.
1323a and 1324a) and a [...] PGC [...].

Each VTS domain PGC layout structure (e.g. 1303)
includes a movie [...] PGC [...] a video title set menu
("VTSM") [...] 1332 area consists of from one to four remote
key router PGCs ("remote key router PGCs 1332") depending
on the number of [...] remote key [...] necessary, given the
preferred layout structure [...] using connection view 901.
More specifically:

    number of router PGCs in a given VTSM =
        total number of chapter points in a corresponding
        movie / 25 (rounded, if a non-integer, to a next
        higher integer value).

In each case, an attempt has been made to minimize the
number of PGCs without detrimental impact on flexibility.
Thus, while the number of PGCs is as indicated above,
complete authoring flexibility with regard to connecting
menus, menu buttons and presentation data without concern
for limitations of the DVD programming specification 207
(FIG. 2) is provided [...] limitations is also minimized.

For [...] number of remote [...] PGCs [...] calculation
reflects [...] each chapter [...] abstraction requires more
than four commands. This in turn reflects that only one
hundred twenty eight commands are allowable in a single
PGC chain in accordance with the DVD programming
specification 207. While not essential, placing each
abstraction completely within separate chains and in equal
numbers throughout like chains provides an efficiently
symmetrical structure. Since DVD programming
specifications 207 provide for up to ninety nine chapter
skeleton-form PGC layout data structure. The skeleton-form points per movie, a maximum of four PGC abstractions is 
PGC layout data structure is preferably formed according to required without detrimental impact in terms of connect- 
DVD program code segment ("program chain" or "PGC") ability. Considering the same parameters and calculations 
abstractions and a network abstraction according to the 60 for menus however, it is seen that only twenty five menu 
invention, utilizing only summary data gathered from data buttons are available per menu without limitation on con- 
management engine 165. Broadly stated, each PGC abstrac- nectability. In practical terms however (i.e. displaying a 
tion is preferably comprised of pre-deterrnined command menu on a conventional television set), this number does not 
combinations, wherein the number of PGCs of a given type present any practical detrimental effect, 
and the number of command combinations of a given type 65 The use of consecutive locations in the PGC layout 
(e.g. button command combinations) are determined accord- structure greatly simplifies the task of finding specific PGCs 
ing to either a default value (e.g. typically one PGC) or relating to specific data types and further for resolving PGC 
connections. A movie title PGC will always be the first
element, a movie router PGC will always be the second 
element, and a display menu PGC can always be located 
merely by adding a known constant plus two times the menu 
number, etc. 
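
The consecutive-location lookup described above can be sketched as follows. This is a minimal illustration only: the description does not give the actual value of the "known constant", so MENU_BASE and the zero-based numbering are assumptions, as is the placement of a menu button router PGC directly after its paired display PGC.

```python
# Hypothetical positions in an indexed PGC layout list.
# The description fixes only the ordering ("first", "second"); the
# numeric values below are assumptions made for illustration.
MOVIE_TITLE_INDEX = 0   # movie title PGC: always the first element
MOVIE_ROUTER_INDEX = 1  # movie router PGC: always the second element
MENU_BASE = 2           # assumed value of the "known constant"


def menu_display_index(menu_number: int) -> int:
    """Locate a display menu PGC: known constant plus two times the menu number."""
    return MENU_BASE + 2 * menu_number


def menu_button_router_index(menu_number: int) -> int:
    """Assumed to sit immediately after the paired display PGC."""
    return menu_display_index(menu_number) + 1
```

Because each menu contributes exactly two consecutive entries (a pair), a lookup needs no search, only arithmetic on the menu number.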

Those skilled in the art will appreciate however, in view 
of the discussion herein, that the PGC abstractions provide 
for other than consecutively arranged elements as an 
indexed list in memory 120 (FIG. 1). Such alternatives, for 
example, include but are not limited to multiple lists, queues 
and/or multi-dimensional arrays stored in memory, in other 
media, and/or in more than one media either locally or in a 
distributed fashion, as with data management engine 165. 
Such methods can be useful where more than one authoring 
location or other distributed environments are utilized. 

The FIG. 12b flowchart, with reference to FIG. 13, shows
in greater detail how compiler 170 constructs a preferred 
PGC layout data structure in an initial skeleton form. As 
shown, compiler 170 begins by storing a first play PGC 
abstraction, a title key PGC abstraction and a menu router 
PGC abstraction into PGC layout structure 1300 (FIG. 13) 
in steps 1207, 1208 and 1209 respectively. Next, in step 

1213, compiler 170 queries data management engine 165 for 
a total number, MenusTot, of menus authored and, in step 

1214, initializes a menu pointer, MenuPtr. In step 1215, 
compiler 170 queries data management engine 165 for a 
total number, ButtonsTot, of buttons authored in a current 
menu (e.g. initially, a first menu). MenusTot will specify the 
number of predetermined menu display and menu button 
router PGC abstractions (i.e. "menu PGC abstraction pairs")
that compiler 170 will add to the structure, while ButtonsTot 
will specify the number of commands that compiler 170 will 
add to each PGC of a current menu PGC abstraction pair. 

In step 1216, compiler 170 adds a menu PGC abstraction 
pair to VMGM PGC structure 1302 (FIG. 13) corresponding 
to the existence of and the number of buttons in a current 
authored menu (e.g. initially, a first menu). If, in step 1217, 
one or more menus are not yet added to VMGM PGC 
structure 1302, then in step 1218, compiler 170 increments
the menu counter and returns to step 1211. 

At this point, compiler 170 lacks any authoring informa- 
tion other than MenusTot and a respective ButtonsTot value 
for each current menu. A similar lack of further
authoring details will also exist for other PGCs in the
skeleton-form PGC layout structure. The preferred PGC
and network abstractions of the invention however, enable 
compiler 170 to accommodate missing authoring details 
merely by inserting null values ("no-ops") into the com- 
mands of the abstracted PGCs for unknown connection 
information (i.e. source-target identification information).
As discussed, compiler 170 will preferably resolve these 
no-ops later in compilation. These abstractions further 
enable menu PGCs to be created independently of movies 
and movie arrangements. Thus, independently created/ 
conceived menu PGCs provide extensive flexibility, allow-
ing an author to link any available menu button of any menu
to any potential target using a user-friendly interface such as
the preferred connection view 901. 
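
The skeleton-building idea above, in which only MenusTot and the per-menu ButtonsTot values are known and every connection is left as a no-op, can be sketched as follows. The class and function names are hypothetical, and using None to stand for a no-op is an assumption made for illustration; the description does not specify how no-ops are encoded.

```python
NO_OP = None  # stands in for a null-value ("no-op") command, assumed encoding


class MenuPgcPair:
    """A menu display PGC and its menu button router PGC (hypothetical model)."""

    def __init__(self, buttons_tot):
        # One command slot per authored button in each PGC of the pair.
        self.display_cmds = [NO_OP] * buttons_tot
        self.router_cmds = [NO_OP] * buttons_tot


def build_menu_skeleton(buttons_per_menu):
    """Add one pair per authored menu; len(buttons_per_menu) plays the role of MenusTot."""
    return [MenuPgcPair(buttons_tot) for buttons_tot in buttons_per_menu]
```

For example, build_menu_skeleton([3, 1]) yields two pairs whose command lists hold three and one unresolved slots respectively, to be filled in later in compilation.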

If instead, in step 1217, all authored menu layouts are 
reflected by corresponding menu PGC abstraction pairs, 
then compiler 170 proceeds to step 1219. In step 1219, 
compiler 170 queries data management engine 165 for the 
total number, MovieTot, of movies, which compiler 170 will 
use to create end commands, VTSs and VTS contents. In 
step 1221, compiler 170 initializes a current movie pointer
("MoviePtr"), as well as two counters, "EndTot" and
"Remote". Compiler 170 will use EndTot to count the
number of available end-of-chapter conditions in each
movie for which an author has specified connections and 
will use Remote to count the number of available playback 
interruption conditions (i.e. by a user pressing a DVD-player 
control, typically on a remote control device) for which an 
author has specified connections. 

In step 1223, compiler 170 queries data management 
engine 165 for the total number of chapters ("ChapterTot")
in a current movie (e.g. initially, the first movie) and, in step 
1225, initializes a current chapter pointer ("ChapterPtr"). If, 
in step 1227, the author has specified a target for the current 
chapter, end-of-chapter condition (i.e. using connection 
view 901), then, in step 1229, compiler 170 increments 
EndTot; otherwise, compiler 170 proceeds to step 1231. 
Similarly, if, in step 1231, the author has specified a target 
for the current chapter, remote-control key playback inter-
ruption ("remote-key") condition, then, in step 1233, com-
piler 170 increments Remote; otherwise, compiler 170 proceeds
to step 1235. 
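
The per-movie counting loop of steps 1223 through 1235 can be sketched as follows. This is an illustrative sketch only: representing each chapter as a dictionary with optional "end_target" and "remote_target" entries is an assumption, not the data layout of data management engine 165.

```python
def count_connected_conditions(chapters):
    """Return (EndTot, Remote) for one movie.

    chapters: list of dicts; a missing or None "end_target" /
    "remote_target" entry means the author specified no target.
    """
    end_tot = 0
    remote = 0
    for chapter in chapters:  # ChapterPtr walks the chapters in order
        if chapter.get("end_target") is not None:     # steps 1227, 1229
            end_tot += 1
        if chapter.get("remote_target") is not None:  # steps 1231, 1233
            remote += 1
    return end_tot, remote
```

Only the existence of a target matters at this stage; the target values themselves are not read until the no-ops are resolved.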

The existence of authored connections is determined 
similarly for both end-of-chapter and remote-key conditions. 
Preferably, objects 1101-1163 (FIG. 11) contain actual 
source and target identifier information (i.e. corresponding 
to authored sources and targets), while the connection 
objects (e.g. 1171a) contain pointers to data stored by these 
objects. Stated alternatively, as a new potential source is 
authored, a connection object is instantiated, including a 
source pointer that points to the potential source and a 
null-value target pointer; if an author later connects such a 
source, then the corresponding connection-object target 
pointer value is replaced by a pointer to the target object. 
(Subsequent editing by an author correspondingly deletes or 
instantiates a connection object and/or changes a source 
pointer or target pointer value.) 

Therefore, compiler 170 determines the existence of a 
connected end command by first querying each connection 
object for a source pointer pointing to the currently selected 
chapter-object. Once found, compiler 170 checks the corre-
sponding target pointer. A null-value target pointer indicates
an unconnected end command while a non-null-value target 
pointer indicates the existence of a connection. Remote key 
(i.e. "menu key" in FIG. 11) connections are similarly 
determined by finding an identifier in a current chapter menu 
key object (e.g. 1157), finding the corresponding source 
pointer in one of the connection objects, and then querying 
the connection object for the existence of a corresponding 
non-null-value target pointer. 
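
The connection-object lookup described above can be sketched as follows. The Connection class and the helper names are hypothetical; Python object references stand in for the source and target pointers, and None models a null-value target pointer.

```python
class Connection:
    """Hypothetical connection object: a source pointer and a target pointer."""

    def __init__(self, source, target=None):
        self.source = source  # points to the potential source object
        self.target = target  # None until an author connects the source


def find_target(connections, source):
    """Query each connection object for the given source; return its target."""
    for conn in connections:
        if conn.source is source:
            return conn.target
    return None  # no connection object refers to this source


def is_connected(connections, source):
    """A non-null target pointer indicates the existence of a connection."""
    return find_target(connections, source) is not None
```

Keeping the pointers in separate connection objects means the source and target objects themselves need not change when an author edits a connection.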

Those skilled in the art, in view of the foregoing, will 
appreciate that considerable variation of the above structure 
will provide the same, related or similar functionality. For 
example, identifiers, labels and even complete movie tree, 
menu tree and/or other objects could well be contained 
within or duplicated within the connections-tree (i.e. objects
1107-1173). A single connection object could also be used
(i.e. having a single list of all connections), as could con-
nection objects that remain despite the deletion of a source.
Other variations are also anticipated. The current structure is
however, preferred in that it provides a compilation time of
only a few milliseconds, minimizes memory usage and
further facilitates debugging, emulation, simulation and
overall symmetry by separating these objects (and their
contained data). In simulation, for example, the restrictions 
imposed by the DVD Specification are not controlling and 
simulation can therefore more efficiently utilize authoring 
data directly from the preferred, non-integrated data man- 
agement engine 165 object structure. 

Returning now to FIG. 12b, if, in step 1235, more chapters
remain in the current movie, then compiler 170 increments 
ChapterPtr and returns to step 1227; otherwise, compiler
170 proceeds to step 1237. In step 1237, compiler 170 adds
a 1-4 PGC, end command router PGC abstraction to layout
structure 1300 (FIG. 13). In step 1238, compiler 170 creates
a VTS domain for the current movie (i.e. including a
VTSM), adding to the VTS domain a movie display PGC in
step 1239 and adding a 1-4 PGC, remote key PGC abstrac-
tion in step 1240.

If, in step 1241, more movies remain in the current movie
title (i.e. tested by comparing MovieTot with MoviePtr),
then compiler 170 increments MoviePtr in step 1243,
re-initializes EndTot and Remote in step 1245 and returns to
step 1223. Otherwise, formation of a PGC layout structure
in skeleton form has been completed.

The FIG. 12c flowchart with reference to FIG. 11 shows
how compiler 170 replaces the no-ops in (skeleton form)
PGC layout structure 1300 with indices (i.e. source or target
pointers) to respective sources and targets, and then further
replaces the indices with element identifiers. In step 1251,
compiler 170 initializes a movie pointer ("MoviePtr") to a
first movie, a chapter pointer ("ChapterPtr") to a first
chapter, a menu pointer ("MenuPtr") to a first menu and a
button pointer ("ButtonPtr") to a first button.

In step 1253, compiler 170 queries data management
engine 165 (i.e. connection-objects) for a source-pointer to
a next (initially, a first) author-connected button. As dis-
cussed earlier, the connection object checks its source-
pointer for a corresponding source having a corresponding
non-null-value target pointer. Since specific connection val-
ues (rather than the existence of a connection as with FIG.
12b) are required in this case, the query utilized results in the
return of such a source-pointer. In step 1255, compiler 170
uses the returned source-pointer to query data management
engine 165 for the corresponding target-pointer and, in step
1257, compiler 170 uses the returned indices to query data
management engine 165 (e.g. via volume 1100, menu-1
1160a and button list 1161 to button-1 1162a) for the source
and target identifiers corresponding to the source and target
pointers. Then, in step 1259, compiler 170 replaces the
current button command no-ops (of the current menu PGC
abstraction pair) with the returned identifiers.

If, in step 1261, more buttons remain unresolved in the
current menu, then compiler 170 increments ButtonPtr in
step 1263 and returns to step 1253; otherwise, compiler 170
proceeds to step 1265. If, in step 1265, menus remain
unresolved, then compiler 170 increments MenuPtr and
resets ButtonPtr to one in step 1267, and then returns to step
1253; otherwise, compiler 170 proceeds to step 1271.

Having resolved and replaced all menu button no-ops,
compiler 170 next resolves all chapter end-command and
remote-key PGC abstraction no-ops in a similar manner.
Compiler 170 queries data management engine 165 for a
(next connected) current chapter end command source-
pointer in step 1271, uses the returned source-pointer to
query data management engine 165 for a corresponding
target-pointer in step 1272, uses the pointers to query data
management engine 165 for corresponding identifiers in step
1273 and replaces corresponding layout structure 1300 PGC
commands with the returned identifiers in step 1274.
Similarly, compiler 170 queries data management engine
165 for a (next connected) current remote key source-pointer
in step 1277, uses the returned source-pointer to query data
management engine 165 for a corresponding target-pointer
in step 1278, uses the pointers to query data management
engine 165 for corresponding identifiers in step 1279 and
replaces corresponding layout structure 1300 PGC com-
mands with the returned identifiers in step 1280.

If, in step 1283, more chapters remain unresolved, then
compiler 170 increments the chapter pointer in step 1285
and returns to step 1271. If instead, no chapters remain
unresolved in the current movie, then compiler 170 proceeds
to step 1286. In step 1286, compiler 170 queries data
management engine 165 (i.e. via volume 1100 to media
database 1103 of FIG. 11) for all audio and video file
references which reference the current movie. In step 1287,
compiler 170 invokes multiplexer 185, which retrieves the
referenced audio and video files and outputs a resultant
multiplexed data file in a conventional manner and in
accordance with the DVD disk format specifications 205
(FIG. 2) of the DVD Specification.

If, in step 1288, more movies remain unresolved in layout
structure 1300, then compiler 170 resets pointers for the next
movie and first chapter in step 1289 and returns to step 1271.
Otherwise, compiler 170 (in a similar manner) resolves first
play, title key jump source and menu router no-ops respec-
tively in steps 1291, 1293 and 1295. Then, in step 1297,
compiler 170 saves the PGC layout structure as a stored file.

With regard to FIGS. 12b and 12c, total authored element
values (i.e. such as MenusTot and ButtonsTot) are main-
tained on an ongoing basis in a corresponding list object or
the functional equivalent of a list object as already dis-
cussed. For example, movies-list object 1105 (FIG. 11), in
addition to a list for containing references to all instantiated
movie objects, also contains a variable for updating the total
number of movies in a current movie title during the course
of one or more authoring sessions. Similarly, button-list
object 1161 contains a list of instantiated button objects (e.g.
1162a through 1162b) as well as a variable indicating the
total number of buttons in menu-1. Other list objects simi-
larly include ongoing totals which are updated during the
course of authoring. One reason is that some early-
generation DVD-players limit the available memory space
for storing PGCs, which correspondingly limits the number
of elements (e.g. menus, menu buttons and chapters) that the
invention permits to be authored. These limits and/or current
totals are therefore selectively conveyed to an author
through interface 160a. Ongoing totals are also beneficial in
that no time periods are required during compilation for
calculating such totals.

As will be understood by those skilled in the art however,
total values might become unimportant for other than com-
pilation purposes as DVD-players are manufactured with
increasing resources in conformance with the current DVD
Specification, in accordance with expanded DVD capabili-
ties and in accordance with the requirements of non-DVD
systems. In such cases, totals can alternatively be calculated
during compilation.

The use of preferably pre-determined PGC abstraction
types comprising preferably pre-determined command com-
binations and the preferred PGC layout structure are thus
factors in providing a maximized authoring flexibility and
efficient compilation among other benefits. Available con-
nections remain completely flexible during authoring and, in
fact, until substitutions are made for no-ops during compi-
lation. The preferred structures of PGC abstractions further
add to compilation efficiency, since a skeleton can be formed
with only summary authoring data, and then authoring
details can be quickly added thereafter.

FIGS. 14 and 15, with reference to FIG. 13, illustrate a
preferred network or "connection-switching" abstraction
according to the invention. The connection-switching
abstraction, while operationally active only during playback
of a movie-title, is also a factor in determining PGC abstrac-
tions produced by compiler 170 as well as the movie, menu
and connection movie-title abstraction utilized by data man-
agement engine 165, interface 160a and authoring engine
160 (FIG. 2). 

Details of the DVD Specification including but not lim-
ited to multiplexed data stream and DVD player
configurations, data formats, protocols and loading of data 
are known to those skilled in the art and will therefore be 
discussed only to the extent required for an understanding of 
the invention. 

DVD programming specifications 207 (FIG. 2) provide 
that PGCs can reside (along with the corresponding presen- 
tation data) in virtual structures including a first play space, 
a video manager ("VMGM") and any of 99 video title sets 
("VTSs"), each of which includes a video title set menu 
space ("VTSM"). Among the limitations of this virtual 
structure however, is first that a PGC in an initial VTS or
VTSM cannot directly trigger (i.e. jump to, using a DVD
jump command) a PGC stored in another VTS (or VTSM).
For example, while a PGC in an initial VTS can "playback
a chapter of presentation data" and the conclusion of chapter
playback can trigger a "followup" PGC, the followup PGC
cannot be stored in a different VTS. Similarly, an initial PGC
used to respond to DVD consumer menu-button activation
cannot trigger a second PGC which is stored in a different
VTS. A further relevant limitation is that the format of
performance data must remain constant within a given VTS.
So, for example, a video data stream having one aspect ratio 
cannot be stored in the same VTS with another video data 
stream having a different aspect ratio. 
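
The two limitations just described can be expressed as simple checks. This is a sketch under stated assumptions: modeling a PGC location as a (kind, VTS number) tuple and the function names are hypothetical, and the checks encode only the restrictions stated above, not the full DVD Specification.

```python
def crosses_vts_boundary(src, dst):
    """True when a direct jump would go from one VTS/VTSM to a different VTS/VTSM.

    src, dst: (kind, vts_number) tuples, where kind is "VTS", "VTSM",
    "VMGM" or "FIRST_PLAY" and vts_number is None outside a title set.
    Such a jump is the case described above as not directly possible.
    """
    title_set_kinds = ("VTS", "VTSM")
    src_kind, src_vts = src
    dst_kind, dst_vts = dst
    return (src_kind in title_set_kinds
            and dst_kind in title_set_kinds
            and src_vts != dst_vts)


def format_is_constant(aspect_ratios):
    """True when every video stream placed in one VTS shares one aspect ratio."""
    return len(set(aspect_ratios)) <= 1
```

A connection for which crosses_vts_boundary returns True is the kind that must be routed indirectly, which is the role the router PGC abstractions play in the discussion that follows.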

The FIG. 14 functional diagram illustrates how the pre-
ferred connection-switching abstraction provides a flexible
and robust functional superstructure within which movie-
title, DVD-player and interactively occurring consumer-
control events are routed and executed. In the figure, VTS-A
1303 and VTS-A+1 exemplify any two different VTSs which
have been created during compilation of a movie-title. It
should also be noted that the illustrated connection arrows
only denote the "path" from one box (i.e. PGC abstraction,
PGC or command-set) to another that can result from an
author's use of connection view 901 (FIG. 9). Thus, fewer
connections than those illustrated might be authored and
each path from one box to another is accomplished indi-
vidually using a single "jump command" or a single transfer
of control by a DVD-player. (The use of multiple connected
arrows and shared arrows is used only for clarity sake, since
the alternative use of individual arrows between each pair of
boxes might otherwise obscure the invention.)

Within each VTS, only a movie display PGC abstraction
operates as a "control" PGC (i.e. directly controls menu
and/or movie display). For example, VTS-A 1303 includes
movie display PGC abstraction 1331 and (within its VTSM
domain 1322) remote key PGC abstraction 1322a. Movie
display PGC abstraction 1331 comprises a single PGC
which includes a command-set ("pre-command") for select-
ing a chapter and initiating playback of the chapter, as well
as an end command ("cell command") that initiates routing
upon the occurrence of an end-of-chapter-playback condi-
tion. Remote menu key 1431a, which denotes an automatic
DVD player function, traps and forwards a remote-key
condition (i.e. user depression of a remote menu key which
interrupts playback). Remote menu key router PGC abstrac-
tion 1322a of VTSM-A 1322 sets the authored target for a
corresponding remote menu key condition (i.e. where a
consumer presses a remote menu key during playback) and
then routes control to a corresponding movie PGC abstrac-
tion or menu PGC abstraction within VMGM 1302. Other
VTSs (e.g. VTS-A+1 1304) are similarly structured for each
movie within the current DVD movie-title.
Each remote menu key router PGC abstraction includes
up to 4 PGCs to accommodate the up to 99 chapter points per 
movie limitation of the DVD Specification. The first remote 
menu key PGC is always assigned as a root menu and is 
always a hardwired (i.e. unalterable) target for any remote 
menu key condition (in accordance with the DVD 
Specification). Therefore, in order to provide for chapter 
dependent routing of a remote menu key condition, a DVD- 
player system register must first be queried for the last 
played chapter. Using the returned last played chapter 
information, program execution is then diverted to the 
corresponding authored remote menu key router PGC. 
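
The sizing and chapter-dependent diversion described above can be sketched as follows. The divisor of 25 comes from the layout formula given earlier in the description (chapter points per movie divided by 25, rounded up). The zero-based PGC index, and the assumption that chapters are assigned to router PGCs in consecutive blocks of 25, are illustrative guesses rather than details stated in the text.

```python
CHAPTERS_PER_ROUTER_PGC = 25  # from the layout formula: chapter points / 25
MAX_CHAPTERS_PER_MOVIE = 99   # limit attributed to the DVD Specification


def router_pgc_count(chapter_total):
    """Number of remote menu key router PGCs, rounded up to the next integer."""
    if not 1 <= chapter_total <= MAX_CHAPTERS_PER_MOVIE:
        raise ValueError("chapter_total must be between 1 and 99")
    return -(-chapter_total // CHAPTERS_PER_ROUTER_PGC)  # ceiling division


def router_pgc_for_chapter(last_played_chapter):
    """Assumed mapping from the last played chapter (1-based) to a router PGC.

    Index 0 is the root menu PGC, which always receives control first.
    """
    if not 1 <= last_played_chapter <= MAX_CHAPTERS_PER_MOVIE:
        raise ValueError("chapter must be between 1 and 99")
    return (last_played_chapter - 1) // CHAPTERS_PER_ROUTER_PGC
```

With 99 chapter points the count is four, which matches the stated maximum of four PGCs per remote menu key router PGC abstraction.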

VTSM 1302 comprises the discussed menu display PGC 
(e.g. 1322) and menu button router PGC (e.g. 1323b)
abstraction pairs (for providing menu control), as well as the 
remaining router PGC abstractions. More specifically, 
movie router PGC abstraction 1322 acts as a playback 
bridge between VTS domains, receiving control from a 
remote key PGC in a first VTS (e.g. remote key PGC 1322a 
of VTS 1303) and then forwarding control to a movie 
display PGC abstraction in a second VTS (e.g. movie play
PGC 1341 of VTS 1304). In contrast, end router PGC 
abstractions (e.g. 1325 and 1326) can be author-connected to 
route control from an end-of-chapter condition to either a 
selected chapter in a selected movie, or to a selected menu 
button in a selected menu. 

As shown, a separate PGC is provided for each author- 
connected end-of-chapter condition. Each end command 
router PGC abstraction is paired with (i.e. responds to) a 
specific end command such that each end-of-chapter condi- 
tion for a given movie will be routed from the end command 
to a unique end router PGC abstraction. Separate end 
command PGCs are required due to a flaw in current 
generation DVD-players whereby the last played chapter is
not reliably available at the end of chapter playback. Upon 
correction of this flaw in future generation DVD-players 
however, end command routing can be accomplished in a 
manner consistent with remote menu key PGC abstractions 
(i.e. using only up to four end-command router PGCs per 
movie). 

A menu display PGC abstraction (e.g. 1323a), when it 
receives control as a target and thereafter while a consumer 
continues to depress menu navigation buttons, effectuates 
control by highlighting a menu button and displaying the 
menu. If however, a consumer activates a menu button, then 
the DVD-player initiates the corresponding router PGC 
abstraction (e.g. 1323b), which routes control (i.e. according
to an authored connection) to either a movie display PGC or 
to a menu display PGC. 

For clarity sake, the first play PGC abstraction 1301 and 
title key PGC abstraction 1321 (FIG. 13) are not shown in 
FIG. 14. Each operates to transfer control to either a menu 
display PGC or a movie display PGC as with the end 
command router PGCs and menu router PGCs. First play 
PGC 1301 is stored in a separate DVD-player storage
location, while title key PGC 1321 is stored in VMGM 1302.

While those skilled in the art will appreciate, in view of 
the discussion herein, that considerable variation might be 
utilized, iterative experimentation with different connection- 
switching abstractions and DVD players has revealed a 
number of considerations. For example, command execution 
delays will necessarily occur as a result of PGC execution 
and greater delays typically result from transfer of control
between a VTS (e.g. 1303 and 1304) and VMGM 1302. 
Another example is that a delay occurring prior to the start 
of a movie is observed to be more acceptable than a similar 
delay during navigation through what can be a large number 
of menus. A still further example is that consistent delay
periods for similar transitions is more acceptable than incon-
sistent delays for similar transitions.

Thus, the preferred connection-switching abstraction pro-
vides a generally symmetrical structure wherein delays are
first minimized by source-router-target execution paths hav-
ing a minimum number of PGCs and PGC commands.
Movie display PGC abstractions are further placed similarly
within each VTS, while menu PGC abstraction pairs are
placed similarly within VMGM 1302. (Note that an author
typically only connects the end command of a last chapter
within any given movie, such that the DVD-player will
continuously play all chapters within the movie before control
is routed outside the corresponding VTS). In addition, movie
router 1322 is only used for VTS-to-VTS transitions. This
reflects, for example, that inconsistent delay between movie-
to-movie playback and menu-to-movie playback is more
acceptable than imposing further delay on menu-to-movie
playback or other alternatives. (For example, further distri-
bution and/or re-distribution of movie and/or menu routing
functions have been observed to produce subjectively less
acceptable results.) In addition, movie router 1322 complex-
ity and PGC length is therefore reduced. It should be
understood however, that these already short delay periods
will further decrease as advances are made in DVD-player
technology and that the resulting decreasing importance of
such considerations might well contribute to further
connection-switching abstraction variations.

The FIG. 15 flowchart broadly illustrates the operation of
preferred connection-switching abstraction 1400. In step
1503, first play PGC abstraction is invoked in response to
insertion of a movie-title into a DVD-player. The first play
PGC abstraction (i.e. now the current PGC abstraction)
determines target information (i.e. a target identifier and, if
needed, target parameters). If, in step 1505, a router is
required, then, the current PGC abstraction routes the target
information and control to a next router abstraction in step
1507 and operation returns to step 1511. If no router is
required in step 1505, then, in step 1509, the current PGC
abstraction routes the target information to the target PGC
abstraction.

If, in step 1511, the target is not a chapter (i.e. playback
of a chapter is not the resultant authored event) then the
target displays a menu (i.e. according to the target
information) in step 1513 and the DVD-player waits for a
menu button to be selected (i.e. step 1513 through 1515 act
as a wait loop). If, in step 1515, a menu button is selected,
then the current PGC abstraction sets authored target infor-
mation for the selected button in step 1517 and operation
returns to step 1505.

If instead, in step 1511, the target is a chapter, then the
target initiates playback of the chapter. If further, in step
1525, a consumer invokes the remote menu key during
playback of the chapter, then the current PGC abstraction
sets authored target information in step 1527 and operation
returns to step 1505. If, in step 1525, the remote menu key
is not invoked (i.e. the chapter plays uninterrupted to its
conclusion) and a chapter end command target has been
authored, then the current PGC abstraction sets the authored
target information in step 1537 and operation returns to step
1505. If, in step 1535, a chapter end command target has not
been authored, then operation continues in step 1545.

If, in step 1545, more chapters exist in the current movie,
then the DVD player increments the chapter number in step
1543 and operation returns to step 1523. If instead, in step

For clarity sake, the operation of preferred connection-
switching abstraction 1400 will also be discussed, by way of
example, with reference to FIG. 14. If, for example, an
authored-connection for first play is set to begin playback of
a first chapter of a first movie stored in VTS-A 1303, then
upon insertion of the DVD movie-title into a DVD-player,
movie display PGC abstraction 1331 will be invoked. Movie
display PGC 1331 will select and initiate playback of the
first chapter.

If the first chapter playback is interrupted by a remote
menu key condition, then the DVD-player will automati-
cally trap the condition (i.e. box 1431a) and will then
[illegible] of remote menu key router 1322a of
VTSM-A 1322. Assuming further that less than 25 chapters
exist in the first movie, the root menu PGC of remote menu
key router 1322a (i.e. now the current source PGC
abstraction) will set the author-selected target for the first
chapter remote menu key condition and will route control to
either movie router 1322 or a menu display PGC (e.g. 1323
or 1324) within VMGM 1302. If movie router 1322 receives
control, then upon receipt, movie router further routes
control to the author-connected movie display PGC, in this
case, movie display PGC 1341 of VTS-A+1 1304, which
will set and initiate playback of the author-selected chapter
of the VTS-A+1 movie.

If instead, playback of the first movie is not interrupted
and only the last chapter of the first movie includes an
author-connected end command, then the DVD-player will
continue to play successive chapters of the first movie until
the conclusion of the last movie. At the conclusion of the last
movie, the DVD-player will execute cell command 1431b
(i.e. end command), which will transfer control to the PGC
in end router 1325 (in VTSM 1302) that corresponds with
the last chapter played, i.e. the last chapter of VTS-A
movie. (Since, in this case, only one chapter in the VTS-A
movie has a connected end-of-chapter playback condition,
end router 1325 will include only the one corresponding
PGC.)

Upon receipt of control from end command 1431, end
router 1325 (i.e. now the current source) will set the corre-
sponding author-connected target included in end router
1325. Assuming the target is the VTS-A+1 movie, end router
1325 will further route control to movie display PGC 1341
of VTS-A+1 1304, which will set and initiate playback
according to the chapter of the VTS-A+1 movie set by end
router 1325. (Since control is not being routed from one VTS
to another VTS, movie router 1322 is not utilized.)

If instead, the current source PGC of end router 1325 (i.e.
again, the only PGC in end router 1325 in this example)
includes an author-selected connection to menu-N 1323,
then end router 1325 will set target parameters and will route
control to menu display PGC 1323a. Menu display PGC
1323a will highlight the button of menu-N 1323 according
to the received target parameters and will then display
menu-N 1323. Menu display PGC 1323a will thereafter
continue to be invoked by the DVD-player and will continue
to highlight a button and display menu-N 1323 correspond-
ingly with each successive uninterrupted (i.e. by consumer
selection of a conflicting DVD control function) consumer
depression of a navigation button. If however, the consumer
next activates a displayed menu button, then the DVD-
player will invoke menu button router PGC 1323b. Once
invoked, menu button router PGC 1323b will set target
parameters according to the author-selected connection for
1545, no more chapters remain unplayed in the current 65 the activated button, and so on. 

movie, then the player suspends playback and (in some Attachment A attached hereto provides computer listings 

models) switches itself off. of preferred PGC abstractions source code according to the 
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invention. For clarity sake, compilation has already been 
completed. Stated alternatively, the no-ops initially included 
in the skeleton-form PGC layout structure have been 
replaced by indices and the indices have been resolved to 
source and target identifiers using the discussed compiler 
and compilation methods. 
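The navigation flow of steps 1503 through 1545 described earlier can be illustrated with a short Python sketch. It is not taken from Attachment A. The function name run_flow, the tuple layouts, the event names "remote_key" and "end", and the condition label "activate" are all assumptions made for illustration, and the router steps (1505 through 1509) are collapsed into a single table lookup.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layouts: ("movie", title, chapter) or ("menu", menu, button).
Target = Tuple[str, int, int]
# Hypothetical source key: (kind, number, chapter-or-button, condition).
Source = Tuple[str, int, int, str]


def run_flow(first_target: Target,
             connections: Dict[Source, Target],
             chapters_in_movie: Dict[int, int],
             events: Iterable[object]) -> List[str]:
    """Follow the described flow for a scripted list of consumer events."""
    pending = iter(events)
    trace: List[str] = []
    target = first_target                      # step 1503: first play sets a target
    while True:
        kind, number, item = target            # steps 1505-1509: route to the target
        if kind == "menu":                     # step 1511: target is not a chapter
            trace.append(f"menu {number}")     # step 1513: display the menu
            button = next(pending, None)       # step 1515: wait for a button
            if button is None:
                return trace                   # script exhausted while waiting
            target = connections[("menu", number, button, "activate")]  # step 1517
            continue
        chapter = item
        while True:
            trace.append(f"movie {number} chapter {chapter}")  # chapter playback
            event = next(pending, "end")
            if event == "remote_key":          # step 1525 -> step 1527
                target = connections[("movie", number, chapter, "remote_key")]
                break
            end_key = ("movie", number, chapter, "end")
            if end_key in connections:         # step 1535 -> step 1537
                target = connections[end_key]
                break
            if chapter >= chapters_in_movie[number]:   # step 1545: none left
                return trace
            chapter += 1                       # step 1543: next chapter


# Hypothetical authored connections for a single three-chapter movie.
demo_connections = {
    ("movie", 1, 3, "end"): ("menu", 1, 1),
    ("menu", 1, 2, "activate"): ("movie", 1, 2),
    ("movie", 1, 2, "remote_key"): ("menu", 1, 2),
}
demo_trace = run_flow(("movie", 1, 1), demo_connections, {1: 3},
                      ["end", "end", "end", 2, "remote_key"])
print(demo_trace)
```

In this sketch a chapter without an authored end connection simply falls through to the next chapter, which mirrors the step 1545 and step 1543 branch of the description.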

As shown in attachment A, the preferred PGC abstractions
utilize a number of DVD player registers. According to
the DVD specification, each DVD player includes 16 general
purpose registers ("GPs"), and 24 system registers
("SPs"). The GPs are functionally undefined and merely
"available for use" by movie title control program PGCs.
Conversely, the SPs have fully defined purposes consistent
with DVD player operation and movie title control program
interfacing.

The preferred GP utilization and corresponding naming
conventions according to the invention are indicated in the
following chart. As shown, the PGC abstractions utilize
only 5 GPs (GP10 and GP12 through GP15), leaving the
remaining 11 GPs available for adding further capabilities.



Register  Referenced as            Description

GP10      Stream Select            Bit 15 = Select audio stream on/off
                                   Bit 14 = Select subtitle stream on/off
                                   Bit 13 = Select angle stream on/off
                                   Bits 10-12 = Audio stream number
                                   Bits 7-9 = Angle stream number
                                   Bits 0-6 = Subtitle stream number
GP12      Target Movie Number      Stored number = Movie number
GP13      Target Button Number     Stored number = Button number
GP14      Target Chapter Number    Stored number = Chapter number
GP15      Temporary Register       Stored number = value used with
                                   current PGC
SP7       Last Chapter Played      DVD player fills the register with the
                                   number of the last chapter played
SP8       Last Highlighted Button  DVD player fills the register with the
                                   number of the last highlighted button



As illustrated by the register utilization chart, GPs are
utilized by source PGC abstractions primarily for designating
(i.e. resolving an available connection to) target PGC
abstractions and for passing to the targets parameters
affecting target operation. The GPs are further utilized by target
PGC abstractions primarily for establishing, manipulating
and recalling localized variables (i.e. relating to a currently
executing PGC command set).

For example, at a time prior to initiating playback of a 
chapter, a source PGC abstraction stores a value in GP10 
("stream select"). That value will later indicate to a target 
PGC which audio, subtitle and/or angle stream is to be 
selected for movie playback. A further example is that, at a 
time prior to routing control to a target PGC abstraction, a 
source PGC abstraction stores a target's designation in a 
combination of registers GP12 ("Movie Number") and 
GP14 ("Chapter Number") for a movie target or GP13 
("Button Number") for a menu target. Finally, PGC abstrac- 
tions preferably utilize GP15 to temporarily store values, 
typically for use within a current PGC operation. 
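The GP10 ("stream select") layout from the register chart can be illustrated with a short Python sketch. Only the bit positions come from the chart. The function names are hypothetical, and the reading of each on/off bit as "apply the stream number stored in the matching field" is an assumption.

```python
from typing import Dict, Optional

# Bit positions taken from the register chart for GP10 ("Stream Select").
AUDIO_ON_BIT, SUBTITLE_ON_BIT, ANGLE_ON_BIT = 15, 14, 13
AUDIO_SHIFT, ANGLE_SHIFT, SUBTITLE_SHIFT = 10, 7, 0
AUDIO_MASK, ANGLE_MASK, SUBTITLE_MASK = 0x7, 0x7, 0x7F


def pack_stream_select(audio: Optional[int] = None,
                       subtitle: Optional[int] = None,
                       angle: Optional[int] = None) -> int:
    """Build a 16-bit GP10 value; None leaves that stream unselected."""
    value = 0
    if audio is not None:
        if not 0 <= audio <= AUDIO_MASK:
            raise ValueError("audio stream number must fit in bits 10-12")
        value |= (1 << AUDIO_ON_BIT) | (audio << AUDIO_SHIFT)
    if subtitle is not None:
        if not 0 <= subtitle <= SUBTITLE_MASK:
            raise ValueError("subtitle stream number must fit in bits 0-6")
        value |= (1 << SUBTITLE_ON_BIT) | (subtitle << SUBTITLE_SHIFT)
    if angle is not None:
        if not 0 <= angle <= ANGLE_MASK:
            raise ValueError("angle stream number must fit in bits 7-9")
        value |= (1 << ANGLE_ON_BIT) | (angle << ANGLE_SHIFT)
    return value


def unpack_stream_select(value: int) -> Dict[str, Optional[int]]:
    """Recover the selected stream numbers, or None where the flag bit is clear."""
    def field(on_bit: int, shift: int, mask: int) -> Optional[int]:
        return (value >> shift) & mask if value & (1 << on_bit) else None

    return {
        "audio": field(AUDIO_ON_BIT, AUDIO_SHIFT, AUDIO_MASK),
        "subtitle": field(SUBTITLE_ON_BIT, SUBTITLE_SHIFT, SUBTITLE_MASK),
        "angle": field(ANGLE_ON_BIT, ANGLE_SHIFT, ANGLE_MASK),
    }


gp10 = pack_stream_select(audio=2, subtitle=5)
print(hex(gp10), unpack_stream_select(gp10))
```

A source PGC abstraction would store a value like gp10 before routing control, and the target would read the fields back when starting playback.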

In most cases, only a portion of a given register ("register
bits") is utilized, while conversely, a given register may be
used for multiple purposes, as seen in the utilization of GP10
in the register chart. Those skilled in the art will appreciate,
given the discussion herein, that the preferred embodiment
enables certain advantages. Among these are that a single
register or register set can be designated in all cases for
similar purposes, thereby minimizing complexities, the
number of registers required and the number of commands
required within a PGC without detrimentally affecting
routing or parameter passing flexibility. Similarly, operations
required to parse register data containing multiple data
values are not needed. Other arrangements consistent with
the teachings of the invention however, are likely in view of
other applications facilitated by these teachings and in
accordance with the scope and spirit of the invention.

While the present invention has been described herein
with reference to a particular embodiment thereof, a latitude
of modification, various changes and substitutions are
intended in the foregoing disclosure, and it will be
appreciated that in some instances some features of the invention
will be employed without a corresponding use of other
features without departing from the spirit and scope of the
invention as set forth.



APPENDIX A: PGC ABSTRACTION SOURCE CODE

Exemplary Included Movies/Menus

2 Movies with 3 chapters each
2 Menus with 4 Buttons each

Exemplary Included Connections

Connections Source                      Connections Target

Movie 1:  Chapter 1   Remote Key        Menu 1:   Button 1
Movie 1:  Chapter 2   Remote Key        Menu 1:   Button 2
Movie 1:  Chapter 3   Remote Key        Menu 1:   Button 3
Movie 1:  Chapter 3   End               Menu 1:   Button 1
Movie 2:  Chapter 1   Remote Key        Menu 2:   Button 1
Movie 2:  Chapter 2   Remote Key        Menu 2:   Button 2
Movie 2:  Chapter 3   Remote Key        Menu 2:   Button 3
Movie 2:  Chapter 3   End               Menu 2:   Button 1
Menu 1:   Button 1                      Movie 1:  Chapter 1
Menu 1:   Button 2                      Movie 1:  Chapter 2
Menu 1:   Button 3                      Movie 1:  Chapter 3
Menu 1:   Button 4                      Menu 2:   Button 1
Menu 2:   Button 1                      Movie 2:  Chapter 1
Menu 2:   Button 2                      Movie 2:  Chapter 2
Menu 2:   Button 3                      Movie 2:  Chapter 3
Menu 2:   Button 4                      Menu 1:   Button 1
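The exemplary connections listed above can also be written as a lookup table, as in the following Python sketch. The key layout and the condition labels "remote_key", "end" and "activate" are hypothetical names chosen for illustration. The appendix leaves the condition blank for menu buttons, so "activate" is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical key: (kind, number, chapter-or-button, condition).
Source = Tuple[str, int, int, str]
# Hypothetical value: (kind, number, chapter-or-button).
Target = Tuple[str, int, int]


def build_connections() -> Dict[Source, Target]:
    """Reproduce the 16 exemplary connections of the appendix."""
    table: Dict[Source, Target] = {}
    for movie in (1, 2):
        for chapter in (1, 2, 3):
            # Remote key during chapter N leads to button N of the same-numbered menu.
            table[("movie", movie, chapter, "remote_key")] = ("menu", movie, chapter)
        # End of chapter 3 leads to button 1 of the same-numbered menu.
        table[("movie", movie, 3, "end")] = ("menu", movie, 1)
    for menu in (1, 2):
        for button in (1, 2, 3):
            # Buttons 1-3 start chapters 1-3 of the same-numbered movie.
            table[("menu", menu, button, "activate")] = ("movie", menu, button)
        # Button 4 switches to button 1 of the other menu.
        other_menu = 2 if menu == 1 else 1
        table[("menu", menu, 4, "activate")] = ("menu", other_menu, 1)
    return table


connections = build_connections()
print(len(connections), connections[("menu", 1, 4, "activate")])
```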






PGC ABSTRACTION SOURCE CODE 



1. VIDEO MANAGER ("VMGM") Program Chains ("PGCs")

First Play PGC

PRE_CMD#1:   MovI GP14, 1                      // Target = Chapter 1
PRE_CMD#2:   MovI GP12, 1                      // Target = Movie 1
PRE_CMD#3:   JumpSS VMGM_PGCN = 2              // Jump To Movie Router

Title PGC (PGC #1)

PRE_CMD#1:   MovI GP13, 1                      // Target = Button 1
PRE_CMD#2:   LinkPGCN PGCN = 3                 // Jump to Menu 1

Movie Router PGC (VMGM PGC #2)

PRE_CMD#1:   MovI GP15, 1                      // Setup Comparison to 1
PRE_CMD#2:   EQ GP15, GP12 JumpTT TTN = 1      // If Target Movie = 1, goto Movie 1
PRE_CMD#3:   MovI GP15, 2                      // Setup Comparison to 2
PRE_CMD#4:   EQ GP15, GP12 JumpTT TTN = 2      // If Target Movie = 2, goto Movie 2
PRE_CMD#5:   JumpSS First Play                 // Should never get here

Menu 1 Display PGC (VMGM PGC #3)

PRE_CMD#1:   Mov GP15, GP13                    // Put target button no. into temp storage
PRE_CMD#2:   MulI GP15, 1024                   // Shift Button number by 10 Bits to the left
PRE_CMD#3:   SetHL_BTNN HLP = GP15             // Highlight target button in menu; display menu

Menu 2 Display PGC (VMGM PGC #4)

PRE_CMD#1:   Mov GP15, GP13                    // Put target button no. into temp storage
PRE_CMD#2:   MulI GP15, 1024                   // Shift Button number by 10 Bits to the left
PRE_CMD#3:   SetHL_BTNN HLP = GP15             // Highlight target button in menu; display menu

Menu 1 Button Router (VMGM PGC #5)

PRE_CMD#1:   Mov GP15, SP8                     // Put last highlighted button in temp storage
PRE_CMD#2:   DivI GP15, 1024                   // Shift Button number 10 bits to the right
PRE_CMD#3:   NEI GP15, 1 GoTo CMDNUM = 7       // If button number ≠ 1, goto CMD #7
PRE_CMD#4:   MovI GP14, 1                      // Target = Chapter 1
PRE_CMD#5:   JumpTT TTN = 1                    // Jump Movie 1
PRE_CMD#6:   Nop
PRE_CMD#7:   NEI GP15, 2 GoTo CMDNUM = 11      // If button number ≠ 2, goto CMD #11
PRE_CMD#8:   MovI GP14, 2                      // Target = Chapter 2
PRE_CMD#9:   JumpTT TTN = 1                    // Jump Movie 1
PRE_CMD#10:  Nop
PRE_CMD#11:  NEI GP15, 3 GoTo CMDNUM = 15      // If button number ≠ 3, goto CMD #15
PRE_CMD#12:  MovI GP14, 3                      // Target = Chapter 3
PRE_CMD#13:  JumpTT TTN = 1                    // Jump Movie 1
PRE_CMD#14:  Nop
PRE_CMD#15:  MovI GP13, 1                      // Target = Button 1
PRE_CMD#16:  LinkPGCN PGCN = 4                 // Jump Menu 2

Menu 2 Button Router (VMGM PGC #6)

PRE_CMD#1:   Mov GP15, SP8                     // Put last highlighted button in temp storage
PRE_CMD#2:   DivI GP15, 1024                   // Shift Button number 10 bits to the right
PRE_CMD#3:   NEI GP15, 1 GoTo CMDNUM = 7       // If button number ≠ 1, goto CMD #7
PRE_CMD#4:   MovI GP14, 1                      // Target = Chapter 1
PRE_CMD#5:   JumpTT TTN = 2                    // Jump Movie 2
PRE_CMD#6:   Nop
PRE_CMD#7:   NEI GP15, 2 GoTo CMDNUM = 11      // If button number ≠ 2, goto CMD #11
PRE_CMD#8:   MovI GP14, 2                      // Target = Chapter 2
PRE_CMD#9:   JumpTT TTN = 2                    // Jump Movie 2
PRE_CMD#10:  Nop
PRE_CMD#11:  NEI GP15, 3 GoTo CMDNUM = 15      // If button number ≠ 3, goto CMD #15
PRE_CMD#12:  MovI GP14, 3                      // Target = Chapter 3
PRE_CMD#13:  JumpTT TTN = 2                    // Jump Movie 2
PRE_CMD#14:  Nop
PRE_CMD#15:  MovI GP13, 1                      // Target = Button 1
PRE_CMD#16:  LinkPGCN PGCN = 3                 // Jump Menu 1

Title 1 End Router Chapter 3 (VMGM PGC #7)

PRE_CMD#1:   MovI GP13, 1                      // Target = Button 1
PRE_CMD#2:   LinkPGCN PGCN = 3                 // Jump Menu 1

Title 2 End Router Chapter 3 (VMGM PGC #8)

PRE_CMD#1:   MovI GP13, 1                      // Target = Button 1
PRE_CMD#2:   LinkPGCN PGCN = 4                 // Jump Menu 2

2. Video Title Segment #1 ("VTS-1") PGCs

Movie 1 Display PGC (VTS PGC #1)



PRE_CMD#1:   Mov GP15, GP14                    // Move Target Chapter into temp storage
PRE_CMD#2:   MovI GP14, 0                      // Zero Target Chapter register
PRE_CMD#3:   EQI GP15, 1 LinkPGN PGN = 1       // If Chapter = 1, Goto Program #1 (Chapter 1)
PRE_CMD#4:   EQI GP15, 2 LinkPGN PGN = 2       // If Chapter = 2, Goto Program #2 (Chapter 2)
PRE_CMD#5:   EQI GP15, 3 LinkPGN PGN = 3       // If Chapter = 3, Goto Program #3 (Chapter 3)

End CMD for Movie 1 Chapter 3 (VTS PGC #1)

C_CMD#1:     CallSS VMGM_PGCN = 7, DOMAIND = 3 // Jump to VMGM PGC #7 (Title 1 End-router for Chapter 3)

Chapter Router PGC (VTSM PGC #1)

PRE_CMD#1:   Mov GP15, SP7                     // Put last played chapter into temp storage
PRE_CMD#2:   Nop
PRE_CMD#3:   Nop
PRE_CMD#4:   Nop
PRE_CMD#5:   GEI GP15, 2 GoTo CMDNUM = 9       // If last chapter ≥ 2, goto CMD #9
PRE_CMD#6:   MovI GP13, 1                      // Target = Button 1
PRE_CMD#7:   JumpSS VMGM_PGCN = 3, DOMAIND = 3 // Jump to Menu 1
PRE_CMD#8:   Nop
PRE_CMD#9:   GEI GP15, 3 GoTo CMDNUM = 13      // If last chapter ≥ 3, goto CMD #13
PRE_CMD#10:  MovI GP13, 2                      // Target = Button 2
PRE_CMD#11:  JumpSS VMGM_PGCN = 3, DOMAIND = 3 // Jump to Menu 1
PRE_CMD#12:  Nop
PRE_CMD#13:  MovI GP13, 3                      // Target = Button 3
PRE_CMD#14:  JumpSS VMGM_PGCN = 3, DOMAIND = 3 // Jump to Menu 1

3. Video Title Segment #2 ("VTS-2") PGCs

Movie 2 Display PGC (VTS PGC #1)

PRE_CMD#1:   Mov GP15, GP14                    // Move Target Chapter into temp storage
PRE_CMD#2:   MovI GP14, 0                      // Zero Target Chapter register
PRE_CMD#3:   EQI GP15, 1 LinkPGN PGN = 1       // If Chapter = 1, Goto Program #1 (Chapter 1)
PRE_CMD#4:   EQI GP15, 2 LinkPGN PGN = 2       // If Chapter = 2, Goto Program #2 (Chapter 2)
PRE_CMD#5:   EQI GP15, 3 LinkPGN PGN = 3       // If Chapter = 3, Goto Program #3 (Chapter 3)

End CMD for Movie 2 Chapter 3 (VTS PGC #1)

C_CMD#1:     CallSS VMGM_PGCN = 8, DOMAIND = 3 // Jump to VMGM PGC #8 (Title 2 End-router for Chapter 3)

Chapter Router PGC (VTSM PGC #1)

PRE_CMD#1:   Mov GP15, SP7                     // Put last played chapter into temp storage
PRE_CMD#2:   Nop
PRE_CMD#3:   Nop
PRE_CMD#4:   Nop
PRE_CMD#5:   GEI GP15, 2 GoTo CMDNUM = 9       // If last chapter ≥ 2, goto CMD #9
PRE_CMD#6:   MovI GP13, 1                      // Target = Button 1
PRE_CMD#7:   JumpSS VMGM_PGCN = 4, DOMAIND = 3 // Jump to Menu 2
PRE_CMD#8:   Nop
PRE_CMD#9:   GEI GP15, 3 GoTo CMDNUM = 13      // If last chapter ≥ 3, goto CMD #13
PRE_CMD#10:  MovI GP13, 2                      // Target = Button 2
PRE_CMD#11:  JumpSS VMGM_PGCN = 4, DOMAIND = 3 // Jump to Menu 2
PRE_CMD#12:  Nop
PRE_CMD#13:  MovI GP13, 3                      // Target = Button 3
PRE_CMD#14:  JumpSS VMGM_PGCN = 4, DOMAIND = 3 // Jump to Menu 2
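The router listings above follow two recurring patterns, which the following Python sketch restates. The first is the 10-bit shift between a button number and the highlight register value. The second is the compare-and-branch cascade that turns a register value into a target. The function names and the (command, operand, registers) tuple are hypothetical. Only the constants and branch order are taken from the listing.

```python
from typing import Dict, Tuple

# Hypothetical result shape: (command, operand, GP registers set beforehand).
Action = Tuple[str, int, Dict[str, int]]


def highlight_value(button: int) -> int:
    """Mirror 'MulI GP15, 1024': shift the button number 10 bits to the left."""
    return button * 1024


def button_from_sp8(sp8: int) -> int:
    """Mirror 'DivI GP15, 1024': shift the SP8 value 10 bits to the right."""
    return sp8 // 1024


def menu1_button_router(sp8: int) -> Action:
    """Mirror VMGM PGC #5: buttons 1-3 play chapters 1-3 of Movie 1."""
    button = button_from_sp8(sp8)
    if button in (1, 2, 3):
        return ("JumpTT", 1, {"GP14": button})   # target chapter, then Movie 1
    return ("LinkPGCN", 4, {"GP13": 1})          # fall through: Menu 2, Button 1


def chapter_router(sp7: int, menu_pgcn: int) -> Action:
    """Mirror the VTSM chapter router: map the last chapter played to a button."""
    if sp7 >= 3:        # reached via both GEI branches (CMD #13)
        button = 3
    elif sp7 >= 2:      # first GEI branch taken, second not (CMD #10)
        button = 2
    else:               # neither branch taken (CMD #6)
        button = 1
    return ("JumpSS", menu_pgcn, {"GP13": button})


print(menu1_button_router(highlight_value(2)))
print(chapter_router(3, menu_pgcn=4))
```

Passing menu_pgcn=3 corresponds to the VTS-1 router (Menu 1) and menu_pgcn=4 to the VTS-2 router (Menu 2).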



We claim: 

1. A method for compiling an authored DVD video 
program, the method comprising: 

providing an abstraction layer between menu buttons, 
movie chapters and connections therebetween, and 
interconnected PGCs, their instructions and their allo- 
cation within a DVD video space and domain structure; 

whereby an author of the DVD video program is able to
author the DVD video program by referencing elements
of the abstraction layer rather than by referencing
the interconnected PGCs, their instructions and their
allocation within the DVD video space and domain
structure.

2. A system for compiling an authored multimedia pre- 
sentation to form a DVD video program, the system com- 
prising: 



means for providing an abstraction layer on top of inter- 
connected PGCs, their instructions and their allocation 
within a DVD video space and domain structure; and 

means for compiling elements of the abstraction layer to
generate DVD program code and content.

3. A system for compiling an authored multimedia pre- 
sentation to form a DVD video program, the system com- 
prising: 

means for authoring DVD program content; and 
means for interlinking content elements with one another 
using automatically generated dummy PGCs. 

4. A method for compiling an authored multimedia pre- 
sentation to form a DVD video program, the method com- 
prising establishing an abstracted reference to program code 
referenced in a DVD program before an absolute reference 
to the program code is known. 

* * * * * 